Abstract



The invention provides a method for selecting, from a repertoire of polypeptides, a population of functional polypeptides which bind 
a target ligand in a first binding site and a generic ligand in a second binding site, which generic ligand is capable of binding functional 
members of the repertoire regardless of target ligand specificity, comprising the steps of: a) contacting the repertoire with the generic 
ligand and selecting functional polypeptides bound thereto; and b) contacting the selected functional polypeptides with the target ligand and 
selecting a population of polypeptides which bind to the target ligand. The invention accordingly provides a method by which a polypeptide 
repertoire is preselected, according to functionality as determined by the ability to bind the generic ligand, and the subset of polypeptides 
obtained as a result of such preselection is then employed for further selection according to the ability to bind the target ligand. 
METHOD TO SCREEN PHAGE DISPLAY LIBRARIES WITH DIFFERENT LIGANDS

The present invention relates to methods for selecting repertoires of polypeptides using
generic and target ligands. In particular, the invention describes a method for selecting
repertoires of antibody polypeptides with a generic ligand to isolate functional subsets
thereof.

Introduction 

The antigen binding domain of an antibody comprises two separate regions: a heavy
chain variable domain (Vh) and a light chain variable domain (Vl, which can be either
Vκ or Vλ). The antigen binding site itself is formed by six polypeptide loops: three
from the Vh domain (H1, H2 and H3) and three from the Vl domain (L1, L2 and L3). A
diverse primary repertoire of V genes that encode the Vh and Vl domains is produced
by the combinatorial rearrangement of gene segments. The Vh gene is produced by the
recombination of three gene segments, Vh, D and Jh. In humans, there are
approximately 51 functional Vh segments (Cook and Tomlinson (1995) Immunol.
Today, 16: 237), 25 functional D segments (Corbett et al. (1997) J. Mol. Biol., 268:
69) and 6 functional Jh segments (Ravetch et al. (1981) Cell, 27: 583), depending on
the haplotype. The Vh segment encodes the region of the polypeptide chain which
forms the first and second antigen binding loops of the Vh domain (H1 and H2), whilst
the Vh, D and Jh segments combine to form the third antigen binding loop of the Vh
domain (H3). The Vl gene is produced by the recombination of only two gene
segments, Vl and Jl. In humans, there are approximately 40 functional Vκ segments
(Schable and Zachau (1993) Biol. Chem. Hoppe-Seyler, 374: 1001), 31 functional Vλ
segments (Williams et al. (1996) J. Mol. Biol., 264: 220; Kawasaki et al. (1997)
Genome Res., 7: 250), 5 functional Jκ segments (Hieter et al. (1982) J. Biol. Chem.,
257: 1516) and 4 functional Jλ segments (Vasicek and Leder (1990) J. Exp. Med., 172:
609), depending on the haplotype. The Vl segment encodes the region of the
polypeptide chain which forms the first and second antigen binding loops of the Vl
domain (L1 and L2), whilst the Vl and Jl segments combine to form the third antigen
binding loop of the Vl domain (L3). Antibodies selected from this primary repertoire
are believed to be sufficiently diverse to bind almost all antigens with at least moderate
affinity. High affinity antibodies are produced by "affinity maturation" of the
rearranged genes, in which point mutations are generated and selected by the immune
system on the basis of improved binding.
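
The segment counts quoted above give a feel for the scale of the primary repertoire. The
following Python sketch simply multiplies the approximate segment numbers given in the
text. It is an illustration only: it assumes every segment combination is permitted,
and it ignores junctional diversity and somatic hypermutation, so the totals are not
figures taken from the cited references. All variable and function names are
hypothetical.

```python
# Approximate numbers of functional human gene segments quoted in the text
# (haplotype-dependent; treated here as fixed values for illustration).
V_H, D, J_H = 51, 25, 6
V_KAPPA, J_KAPPA = 40, 5
V_LAMBDA, J_LAMBDA = 31, 4


def segment_combinations() -> dict:
    """Count germline segment combinations.

    Assumption: every segment can combine with every other; junctional
    diversity and somatic hypermutation are deliberately ignored.
    """
    heavy = V_H * D * J_H            # Vh-D-Jh rearrangements
    kappa = V_KAPPA * J_KAPPA        # V-kappa / J-kappa rearrangements
    lam = V_LAMBDA * J_LAMBDA        # V-lambda / J-lambda rearrangements
    light = kappa + lam              # a light chain is either kappa or lambda
    return {
        "heavy": heavy,
        "kappa": kappa,
        "lambda": lam,
        "light": light,
        "paired": heavy * light,     # heavy/light chain pairings
    }


counts = segment_combinations()
for name, value in counts.items():
    print(f"{name}: {value:,}")
```

Under these simplifying assumptions the segment combinations alone give a few million
heavy/light pairings; the real repertoire is larger because of the diversity generated
at the segment junctions.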
Analysis of the structures and sequences of antibodies has shown that five of the six
antigen binding loops (H1, H2, L1, L2 and L3) possess a limited number of main-chain
conformations or canonical structures (Chothia and Lesk (1987) J. Mol. Biol., 196:
901; Chothia et al. (1989) Nature, 342: 877). The main-chain conformations are
determined by (i) the length of the antigen binding loop, and (ii) particular residues, or
types of residue, at certain key positions in the antigen binding loop and the antibody
framework. Analysis of the loop lengths and key residues has enabled us to predict
the main-chain conformations of H1, H2, L1, L2 and L3 encoded by the majority of
human antibody sequences (Chothia et al. (1992) J. Mol. Biol., 227: 799; Tomlinson et
al. (1995) EMBO J., 14: 4628; Williams et al. (1996) J. Mol. Biol., 264: 220).
Although the H3 region is much more diverse in terms of sequence, length and
structure (due to the use of D segments), it also forms a limited number of main-chain
conformations for short loop lengths which depend on the length and the presence of
particular residues, or types of residue, at key positions in the loop and the antibody
framework (Martin et al. (1996) J. Mol. Biol., 263: 800; Shirai et al. (1996) FEBS
Letters, 399: 1).

A similar analysis of side-chain diversity in human antibody sequences has enabled the
separation of the pattern of sequence diversity in the primary repertoire from that
created by somatic hypermutation. It was found that the two patterns are
complementary: diversity in the primary repertoire is focused at the centre of the
antigen binding site, whereas somatic hypermutation spreads diversity to regions at the
periphery that are highly conserved in the primary repertoire (Tomlinson et al. (1996)
J. Mol. Biol., 256: 813; Ignatovich et al. (1997) J. Mol. Biol., 268: 69). This
complementarity seems to have evolved as an efficient strategy for searching sequence
space, given the limited number of B cells available for selection at any given time. Thus,
antibodies are first selected from the primary repertoire based on diversity at the centre
of the binding site. Somatic hypermutation is then left to optimise residues at the
periphery without disrupting favourable interactions established during the primary
response.

The recent advent of phage-display technology (Smith (1985) Science, 228: 1315; Scott
and Smith (1990) Science, 249: 386; McCafferty et al. (1990) Nature, 348: 552) has
enabled the in vitro selection of human antibodies against a wide range of target
antigens from "single pot" libraries. These phage-antibody libraries can be grouped into
two categories: natural libraries which use rearranged V genes harvested from human B
cells (Marks et al. (1991) J. Mol. Biol., 222: 581; Vaughan et al. (1996) Nature
Biotech., 14: 309) or synthetic libraries whereby germline V gene segments are
'rearranged' in vitro (Hoogenboom & Winter (1992) J. Mol. Biol., 227: 381; Nissim et
al. (1994) EMBO J., 13: 692; Griffiths et al. (1994) EMBO J., 13: 3245; De Kruif et
al. (1995) J. Mol. Biol., 248: 97) or where synthetic CDRs are incorporated into a
single rearranged V gene (Barbas et al. (1992) Proc. Natl. Acad. Sci. USA, 89: 4457).
Although synthetic libraries help to overcome the inherent biases of the natural
repertoire which can limit the effective size of phage libraries constructed from
rearranged V genes, they require the use of long degenerate PCR primers which
frequently introduce base-pair deletions into the assembled V genes. This high degree
of randomisation may also lead to the creation of antibodies which are unable to fold
correctly and are therefore also non-functional. Furthermore, antibodies selected from
these libraries may be poorly expressed and, in many cases, will contain framework
mutations that may affect the antibody's immunogenicity when used in human therapy.

Recently, in an extension of the synthetic library approach, it has been suggested
(WO97/08320, Morphosys) that human antibody frameworks can be pre-optimised by
synthesising a set of 'master genes' that have consensus framework sequences and
incorporate amino acid substitutions shown to improve folding and expression.
Diversity in the CDRs is then incorporated using oligonucleotides. Since it is desirable
to produce artificial human antibodies which will not be recognised as foreign by the
human immune system, the use of consensus frameworks which, in most cases, do not
correspond to any natural framework is a disadvantage of this approach. Furthermore,
since it is likely that the CDR diversity will also have an effect on folding and/or
expression, it is preferable to optimise the folding and/or expression (and remove any
frame-shifts or stop codons) after the V gene has been fully assembled. To this end, it
would be desirable to have a selection system which could eliminate non-functional or
poorly folded/expressed members of the library before selection with the target antigen
is carried out.

A further problem with the libraries of the prior art is that, because the main-chain
conformation is heterogeneous, three-dimensional structural modelling is difficult
because suitable high resolution crystallographic data may not be available. This is a
particular problem for the H3 region, where the vast majority of antibodies derived
from natural or synthetic antibody libraries have medium length or long loops and
therefore cannot be modelled.

Summary of the Invention

According to the first aspect of the present invention, there is provided a method for
selecting, from a repertoire of polypeptides, a population of functional polypeptides
which bind a target ligand in a first binding site and a generic ligand in a second binding
site, which generic ligand is capable of binding functional members of the repertoire
regardless of target ligand specificity, comprising the steps of:

a) contacting the repertoire with the generic ligand and selecting functional
polypeptides bound thereto; and

b) contacting the selected functional polypeptides with the target ligand and
selecting a population of polypeptides which bind to the target ligand.
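
Steps a) and b) can be pictured as two successive filters. The Python sketch below is a
toy model only, not a laboratory protocol: the `Polypeptide` class, its `folded` and
`specificity` fields and the example repertoire are all hypothetical, and correct
folding is assumed to be the sole determinant of generic ligand binding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Polypeptide:
    name: str
    folded: bool        # assumption: correct folding/expression == binds the generic ligand
    specificity: str    # target ligand this member would bind if functional


def binds_generic(p: Polypeptide) -> bool:
    """Generic ligand binding is independent of target ligand specificity."""
    return p.folded


def binds_target(p: Polypeptide, target: str) -> bool:
    """Target binding needs a functional member AND the right specificity."""
    return p.folded and p.specificity == target


def select(repertoire: list, target: str) -> list:
    # Step a): contact with the generic ligand, keep functional members.
    preselected = [p for p in repertoire if binds_generic(p)]
    # Step b): contact the preselected subset with the target ligand.
    return [p for p in preselected if binds_target(p, target)]


repertoire = [
    Polypeptide("clone1", folded=True, specificity="lysozyme"),
    Polypeptide("clone2", folded=False, specificity="lysozyme"),  # e.g. frame-shift
    Polypeptide("clone3", folded=True, specificity="BSA"),
    Polypeptide("clone4", folded=False, specificity="BSA"),       # e.g. stop codon
]

selected = select(repertoire, "lysozyme")
print([p.name for p in selected])
```

In this toy repertoire, step a) discards the two misfolded clones whatever their
intended specificity, and step b) then picks out the single functional clone specific
for the chosen target.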



The invention accordingly provides a method by which a repertoire of polypeptides is
preselected, according to functionality as determined by the ability to bind the generic
ligand, and the subset of polypeptides obtained as a result of preselection is then
employed for further rounds of selection according to the ability to bind the target
ligand. Although, in a preferred embodiment, the repertoire is first selected with the
generic ligand, it will be apparent to one skilled in the art that the repertoire may be
contacted with the ligands in the opposite order, i.e. with the target ligand before the
generic ligand.

The invention permits the person skilled in the art to remove, from a chosen repertoire
of polypeptides, those polypeptides which are non-functional, for example as a result of
the introduction of frame-shift mutations, stop codons, folding mutants or expression
mutants which would be or are incapable of binding to substantially any target ligand.
Such non-functional mutants are generated by the normal randomisation and variation
procedures employed in the construction of polypeptide repertoires. At the same time,
the invention permits the person skilled in the art to enrich a chosen repertoire of
polypeptides for those polypeptides which are functional, well folded and highly
expressed.

Preferably, two or more subsets of polypeptides are obtained from a repertoire by the
method of the invention, for example, by prescreening the repertoire with two or more
generic ligands, or by contacting the repertoire with the generic ligand(s) under
different conditions. Advantageously, the subsets of polypeptides thus obtained are
combined to form a further repertoire of polypeptides, which may be further screened
by contacting with target and/or generic ligands.
Preferably, the library according to the invention comprises polypeptides of the
immunoglobulin superfamily, such as antibody polypeptides or T-cell receptor
polypeptides. Advantageously, the library may comprise individual immunoglobulin
domains, such as the Vh or Vl domains of antibodies, or the Vβ or Vα domains of
T-cell receptors. In a preferred embodiment, therefore, repertoires of, for example, Vh
and Vl polypeptides may be individually prescreened using a generic ligand and then
combined to produce a functional repertoire comprising both Vh and Vl polypeptides.
Such a repertoire can then be screened with a target ligand in order to isolate
polypeptides comprising both Vh and Vl domains and having the desired binding
specificity.

In an advantageous embodiment, the generic ligand selected for use with
immunoglobulin repertoires is a superantigen. Superantigens are able to bind to
functional immunoglobulin molecules, or subsets thereof comprising particular
main-chain conformations, irrespective of target ligand specificity. Alternatively, generic
ligands may be selected from any ligand capable of binding to the general structure of
the polypeptides which make up any given repertoire, such as antibodies themselves,
metal ion matrices, organic compounds including proteins or peptides, and the like.

In a second aspect, the invention provides a library wherein the functional members
have binding sites for both generic and target ligands. Libraries may be specifically
designed for this purpose, for example by constructing antibody libraries having a
main-chain conformation which is recognised by a given superantigen, or by
constructing a library in which substantially all potentially functional members possess
a structure recognisable by an antibody ligand.

In a third aspect, the invention provides a method for detecting, immobilising,
purifying or immunoprecipitating one or more members of a repertoire of polypeptides
previously selected according to the invention, comprising binding the members to the
generic ligand.

In a fourth aspect, the invention provides a library comprising a repertoire of
polypeptides of the immunoglobulin superfamily, wherein the members of the
repertoire have a known main-chain conformation.


In a fifth aspect, the invention provides a method for selecting a polypeptide having a
desired generic and/or target ligand binding site from a repertoire of polypeptides,
comprising the steps of:

a) expressing a library according to the preceding aspects of the invention;

b) contacting the polypeptides with generic and/or target ligands and selecting
those which bind the generic and/or target ligand;

c) optionally amplifying the selected polypeptide(s) which bind the generic
and/or target ligand; and

d) optionally repeating steps a) - c).

Repertoires of polypeptides are advantageously both generated and maintained in the 
form of a nucleic acid library. Therefore, in a sixth aspect, the invention provides a 
nucleic acid library encoding a repertoire of such polypeptides. 

Brief Description of the Figures 

Figure 1: Bar graph indicating positions in the Vh and Vκ regions of the human
antibody repertoire which exhibit extensive natural diversity and make antigen contacts
(see Tomlinson et al. (1996) J. Mol. Biol., 256: 813). The H3 and the end of L3 are
not shown in this representation although they are also highly diverse and make antigen
contacts. Although sequence diversity in the human lambda genes has been thoroughly
characterised (see Ignatovich et al. (1997) J. Mol. Biol., 268: 69), very little data on
antigen contacts currently exists for three-dimensional lambda structures.

Figure 2: Sequence of the scFv that forms the basis of a library according to the
invention. There are currently two versions of the library: a "primary" library wherein
18 positions are varied and a "somatic" library wherein 12 positions are varied. The six
loop regions H1, H2, H3, L1, L2 and L3 are indicated. CDR regions as defined by
Kabat (Kabat et al. (1991). Sequences of proteins of immunological interest, U.S.
Department of Health and Human Services) are underlined.

Figure 3: Analysis of functionality in a library according to the invention before and
after selecting with the generic ligands Protein A and Protein L. Here Protein L is
coated on an ELISA plate, the scFv supernatants are bound to it and detection of scFv
binding is with Protein A-HRP. Therefore, only those scFv capable of binding both
Protein A and Protein L give an ELISA signal.

Figure 4: Sequences of clones selected from libraries according to the invention, after
panning with bovine ubiquitin, rat BIP, bovine histone, NIP-BSA, FITC-BSA, human
leptin, human thyroglobulin, BSA, hen egg lysozyme, mouse IgG and human IgG.
Underlines in the sequences indicate the positions which were varied in the respective
libraries.

Figure 5: 5a: Comparison of scFv concentration produced by the unselected and
preselected "primary" DVT libraries in host cells. 5b: standard curve of ELISA as
determined from known standards.

Figure 6: Western blot of phage from preselected and unselected DVT "primary"
libraries, probed with an anti-phage pIII antibody in order to determine the percentage
of phage bearing scFv.

Detailed Description of the Invention
Definitions

Repertoire A repertoire is a population of diverse variants, for example nucleic acid 
variants which differ in nucleotide sequence or polypeptide variants which differ in 
amino acid sequence. A library according to the invention will encompass a repertoire 
of polypeptides or nucleic acids. According to the present invention, a repertoire of 
polypeptides is designed to possess a binding site for a generic ligand and a binding site
for a target ligand. The binding sites may overlap, or be located in the same region of 
the molecule, but their specificities will differ. 

Organism As used herein, the term "organism" refers to all cellular life-forms, 
such as prokaryotes and eukaryotes, as well as non-cellular, nucleic acid-containing
entities, such as bacteriophage and viruses. 

Functional As used herein, the term "functional" refers to a polypeptide which
possesses either the native biological activity of the naturally-produced proteins of its
type, or any specific desired activity, for example as judged by its ability to bind to
ligand molecules, defined below. Examples of "functional" polypeptides include an
antibody binding specifically to an antigen through its antigen-binding site, a receptor
molecule (e.g. a T-cell receptor) binding its characteristic ligand and an enzyme binding
to its substrate. In order for a polypeptide to be classified as functional according to the
invention, it follows that it first must be properly processed and folded so as to retain
its overall structural integrity, as judged by its ability to bind the generic ligand, also
defined below.
For the avoidance of doubt, functionality is not equivalent to the ability to bind the
target ligand. For instance, a functional anti-CEA monoclonal antibody will not be able
to bind specifically to target ligands such as bacterial LPS. However, because it is
capable of binding a target ligand (i.e. it would be able to bind to CEA if CEA were the
target ligand) it is classed as a "functional" antibody molecule and may be selected by
binding to a generic ligand, as defined below. Typically, non-functional antibody
molecules will be incapable of binding to any target ligand.

Generic Ligand A generic ligand is a ligand that binds a substantial proportion of
functional members in a given repertoire. Thus, the same generic ligand can bind many
members of the repertoire regardless of their target ligand specificities (see below). In
general, the presence of a functional generic ligand binding site indicates that the
repertoire member is expressed and folded correctly. Thus, binding of the generic
ligand to its binding site provides a method for preselecting functional polypeptides
from a repertoire of polypeptides.

Target Ligand The target ligand is a ligand for which a specific binding member or
members of the repertoire is to be identified. Where the members of the repertoire are
antibody molecules, the target ligand may be an antigen and where the members of the
repertoire are enzymes, the target ligand may be a substrate. Binding to the target
ligand is dependent upon both the member of the repertoire being functional, as
described above under generic ligand, and upon the precise specificity of the binding
site for the target ligand.

Subset The subset is a part of the repertoire. In the terms of the present invention, it is
often the case that only a subset of the repertoire is functional and therefore possesses a
functional generic ligand binding site. Furthermore, it is also possible that only a
fraction of the functional members of a repertoire (yet significantly more than would
bind a given target ligand) will bind the generic ligand. These subsets are able to be
selected according to the invention.

Subsets of a library may be combined or pooled to produce novel repertoires which
have been preselected according to desired criteria. Combined or pooled repertoires
may be simple mixtures of the polypeptide members preselected by generic ligand
binding, or may be manipulated to combine two polypeptide subsets. For example, Vh
and Vl polypeptides may be individually prescreened, and subsequently combined at
the genetic level onto single vectors such that they are expressed as combined Vh-Vl
dimers, such as scFv.
Library The term library refers to a mixture of heterogeneous polypeptides or nucleic
acids. The library is composed of members, which have a single polypeptide or nucleic
acid sequence. To this extent, library is synonymous with repertoire. Sequence
differences between library members are responsible for the diversity present in the
library. The library may take the form of a simple mixture of polypeptides or nucleic
acids, or may be in the form of organisms or cells, for example bacteria, viruses, animal
or plant cells and the like, transformed with a library of nucleic acids. Preferably, each
individual organism or cell contains only one member of the library. Advantageously,
the nucleic acids are incorporated into expression vectors, in order to allow expression
of the polypeptides encoded by the nucleic acids. In a preferred aspect, therefore, a
library may take the form of a population of host organisms, each organism containing
one or more copies of an expression vector containing a single member of the library in
nucleic acid form which can be expressed to produce its corresponding polypeptide
member. Thus, the population of host organisms has the potential to encode a large
repertoire of genetically diverse polypeptide variants.

Immunoglobulin superfamily This refers to a family of polypeptides which retain the
immunoglobulin fold characteristic of immunoglobulin (antibody) molecules, which
contains two β sheets and, usually, a conserved disulphide bond. Members of the
immunoglobulin superfamily are involved in many aspects of cellular and non-cellular
interactions in vivo, including widespread roles in the immune system (for example,
antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for
example the ICAM molecules) and intracellular signalling (for example, receptor
molecules, such as the PDGF receptor). The present invention is applicable to all
immunoglobulin superfamily molecules, since variation therein is achieved in similar
ways. Preferably, the present invention relates to immunoglobulins (antibodies).

Main-chain conformation The main-chain conformation refers to the Cα backbone
trace of a structure in three dimensions. When individual hypervariable loops of
antibodies or TCR molecules are considered, the main-chain conformation is
synonymous with the canonical structure. As set forth in Chothia and Lesk (1987) J.
Mol. Biol., 196: 901 and Chothia et al. (1989) Nature, 342: 877, antibodies display a
limited number of canonical structures for five of their six hypervariable loops (H1,
H2, L1, L2 and L3), despite considerable side-chain diversity in the loops themselves.
The precise canonical structure exhibited depends on the length of the loop and the
identity of certain key residues involved in its packing. The sixth loop (H3) is much
more diverse in both length and sequence and therefore only exhibits canonical
structures for certain short loop lengths (Martin et al. (1996) J. Mol. Biol., 263: 800;
Shirai et al. (1996) FEBS Letters, 399: 1). In the present invention, all six loops will
preferably have canonical structures and hence the main-chain conformation for the
entire antibody molecule will be known.

Antibody polypeptide Antibodies are immunoglobulins that are produced by B cells
and form a central part of the host immune defence system in vertebrates. An antibody
polypeptide, as used herein, is a polypeptide which either is an antibody or is a part of
an antibody, modified or unmodified. Thus, the term antibody polypeptide includes a
heavy chain, a light chain, a heavy chain-light chain dimer, a Fab fragment, a F(ab')2
fragment, a Dab fragment, or an Fv fragment, including a single chain Fv (scFv).
Methods for the construction of such antibody molecules are well known in the art.

Superantigen Superantigens are antigens, mostly in the form of toxins expressed in
bacteria, which interact with members of the immunoglobulin superfamily outside the
conventional ligand binding sites for these molecules. Staphylococcal enterotoxins
interact with T-cell receptors and have the effect of stimulating CD4+ T-cells.
Superantigens for antibodies include the molecules Protein G, which binds the IgG
constant region (Bjorck and Kronvall (1984) J. Immunol., 133: 969; Reis et al. (1984)
J. Immunol., 132: 3091), Protein A, which binds the IgG constant region and the Vh
domain (Forsgren and Sjoquist (1966) J. Immunol., 97: 822), and Protein L, which binds
the Vl domain (Bjorck (1988) J. Immunol., 140: 1994).

Preferred Embodiments of the Invention 

The present invention provides a selection system which eliminates (or significantly
reduces the proportion of) non-functional or poorly folded/expressed members of a
polypeptide library whilst enriching for functional, folded and well expressed members
before a selection for specificity against a "target ligand" is carried out. A repertoire of
polypeptide molecules is contacted with a "generic ligand", a protein that has affinity
for a structural feature common to all functional, for example complete and/or correctly
folded, proteins of the relevant class. Note that the term "ligand" is used broadly in
reference to molecules of use in the present invention. As used herein, the term
"ligand" refers to any entity that will bind to or be bound by a member of the
polypeptide library.

A significant number of defective proteins present in the initial repertoire fail to bind
the generic ligand and are thereby eliminated. This selective removal of non-functional
polypeptides from a library results in a marked reduction in its actual size, while its
functional size is maintained, with a corresponding increase in its quality. Polypeptides
which are retained by virtue of binding the generic ligand constitute a 'first selected
pool' or 'subset' of the original repertoire. Consequently, this 'subset' is enriched for
functional, well folded and well expressed members of the initial repertoire.
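
The distinction between actual size and functional size can be made concrete with a
short calculation. The Python sketch below uses hypothetical numbers chosen purely for
illustration (a library of one million members of which a quarter are functional);
the parameter names `functional_retained` and `nonfunctional_leak` are assumptions
introduced here to model an imperfect preselection and do not come from the text.

```python
def preselect(actual_size: float,
              functional_fraction: float,
              functional_retained: float = 1.0,
              nonfunctional_leak: float = 0.0) -> tuple:
    """Model the effect of generic-ligand preselection on library size.

    Returns (new actual size, new functional size, new functional fraction).
    All inputs are hypothetical illustration values, not measured data.
    """
    functional = actual_size * functional_fraction
    nonfunctional = actual_size - functional
    kept_functional = functional * functional_retained
    kept_nonfunctional = nonfunctional * nonfunctional_leak
    new_actual = kept_functional + kept_nonfunctional
    quality = kept_functional / new_actual if new_actual else 0.0
    return new_actual, kept_functional, quality


# Idealised preselection: every functional member kept, every defective one removed.
ideal = preselect(1_000_000, 0.25)
# Imperfect preselection: 10% of non-functional members slip through.
leaky = preselect(1_000_000, 0.25, nonfunctional_leak=0.1)

print(ideal)
print(leaky)
```

In the idealised case the actual size falls to a quarter of its starting value while
the functional size is unchanged, so every remaining member is functional; with some
leakage the enrichment is smaller but the same trend holds.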

The polypeptides of the first selected pool or subset are subsequently contacted with at
least one "target ligand", which binds to polypeptides with a given functional
specificity. Such target ligands include, but are not limited to, either half of a
receptor/ligand pair (e.g. a hormone or other cell-signalling molecule, such as a
neurotransmitter, and its cognate receptor), either of a binding pair of cell adhesion
molecules, a protein substrate that is bound by the active site of an enzyme, a protein,
peptide or small organic compound against which a particular antibody is to be directed
or even an antibody itself. Consequently, the use of such a library is less
labour-intensive and more economical, in terms of both time and materials, than is that of a
conventional library. In addition, since, compared to a repertoire which has not been
selected with a generic ligand, the first selected pool will contain a much higher ratio of
molecules able to bind the target ligand to those that are unable to bind the target
ligand, there will be a significant reduction of background during selection with the
"target ligand".

Combinatorial selection schemes are also contemplated according to the invention.
Multiple selections of the same initial polypeptide repertoire can be performed in
parallel or in series using different generic and/or target ligands. Thus, the repertoire
can first be selected with a single generic ligand and then subsequently selected in
parallel using different target ligands. The resulting subsets can then be used separately
or combined, in which case the combined subset will have a range of target ligand
specificities but a single generic ligand specificity. Alternatively, the repertoire can first
be selected with a single target ligand and then subsequently selected in parallel using
different generic ligands. The resulting subsets can then be used separately or
combined, in which case the combined subset will have a range of generic ligand
specificities but a single target ligand specificity. The use of more elaborate schemes
is also envisaged. For example, the initial repertoire can be subjected to two rounds of
selection using two different generic ligands, followed by selection with the target
ligand. This produces a subset in which all members bind both generic ligands and the
target ligand. Alternatively, if the selection of the initial repertoire with the two generic
ligands is performed in parallel and the resulting subsets combined and then selected
with the target ligand, the resulting subset binds at least one of the two generic ligands



WO 99/20749 

PCT/GB98/0313S 


and the target ligand. Combined or pooled repertoires may be simple mixtures of the
subsets or may be manipulated to physically link the subsets. For example, Vh and Vl
polypeptides may be individually selected in parallel by binding two different generic
ligands, and subsequently combined at the genetic level onto single vectors such that
they are expressed as combined Vh-Vl. This repertoire can then be selected against the
target ligand such that the selected members are able to bind both generic ligands and the
target ligand.
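
The difference between selecting in series and selecting in parallel can be expressed in
terms of set operations: successive selections behave like intersections, while pooling
parallel selections behaves like a union. The Python sketch below is a toy model under
that assumption; the clone labels and the sets of binders are hypothetical.

```python
def select(pool: set, binders: set) -> set:
    """One round of selection: keep only the pool members that bind the ligand."""
    return pool & binders


# Hypothetical repertoire and binding behaviour, for illustration only.
repertoire = {"a", "b", "c", "d", "e", "f"}
binds_generic_1 = {"a", "b", "c"}
binds_generic_2 = {"b", "c", "d"}
binds_target = {"b", "d", "e"}

# In series: generic ligand 1, then generic ligand 2, then the target ligand.
# Every surviving member binds both generic ligands and the target ligand.
in_series = select(select(select(repertoire, binds_generic_1),
                          binds_generic_2), binds_target)

# In parallel: select with each generic ligand separately, pool the subsets,
# then select with the target ligand. Every surviving member binds at least
# one of the generic ligands and the target ligand.
pooled = select(repertoire, binds_generic_1) | select(repertoire, binds_generic_2)
in_parallel = select(pooled, binds_target)

print(sorted(in_series))
print(sorted(in_parallel))
```

The serial scheme is the more stringent of the two, so its output is always contained
in the output of the parallel scheme for the same ligands.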

The invention encompasses libraries of functional polypeptides selected or selectable by
the methods broadly described above, as well as nucleic acid libraries encoding
polypeptide molecules which may be used in a selection performed according to these
methods (preferably, molecules which comprise a first binding site for a target ligand
and a second binding site for a generic ligand). In addition, the invention provides
methods for detecting, immobilising, purifying or immunoprecipitating one or more
members of a repertoire of functional polypeptides selected using the generic or target
ligands according to the invention.

The invention is particularly applicable to the enrichment of libraries of molecules of
the immunoglobulin superfamily. This is particularly true as regards the generation of
populations of antibodies and T-cell receptors which are functional and have a desired
specificity, as is required for use in diagnostic, therapeutic or prophylactic procedures.
To this end, the invention provides antibody and T-cell receptor libraries wherein all
the members have both natural frameworks and loops of known main-chain
conformation, as well as strategies for useful mutagenesis of the starting sequence and
the subsequent selection of functional variants so generated. Such polypeptide libraries
may comprise Vh or Vβ domains or, alternatively, Vl or Vα domains,
or even both Vh or Vβ and Vl or Vα domains.

There is significant need in the art for improved libraries of antibody or T-cell receptor
molecules. For example, despite progress in the creation of "single pot" phage-antibody
libraries, several problems still remain. Natural libraries (Marks et al. (1991) J. Mol.
Biol., 222: 581; Vaughan et al. (1996) Nature Biotech., 14: 309) which use rearranged
V genes harvested from human B cells are highly biased due to the positive and
negative selection of the B cells in vivo. This can limit the effective size of phage
libraries constructed from rearranged V genes. In addition, clones derived from natural
libraries invariably contain framework mutations which may affect the antibody's
immunogenicity when used in human therapy. Synthetic libraries (Hoogenboom &
Winter (1992) J. Mol. Biol., 227: 381; Barbas et al. (1992) Proc. Natl. Acad. Sci.
USA, 89: 4457; Nissim et al. (1994) EMBO J., 13: 692; Griffiths et al. (1994) EMBO
J., 13: 3245; De Kruif et al. (1995) J. Mol. Biol., 248: 97) can overcome the problem
of bias but they require the use of long degenerate PCR primers which frequently
introduce base-pair deletions into the assembled V genes. This high degree of
randomisation may also lead to the creation of antibodies which are unable to fold
correctly and are therefore also non-functional. In many cases it is likely that these
non-functional members will outnumber the functional members in a library. Even if the
frameworks can be pre-optimised for folding and/or expression (WO97/08320,
Morphosys) by synthesising a set of 'master genes' with consensus framework
sequences and by incorporating amino acid substitutions shown to improve folding and
expression, there remains the problem of immunogenicity since, in most cases, the
consensus sequences do not correspond to any natural framework. Furthermore, since it
is likely that the CDR diversity will also have an effect on folding and/or expression, it
is preferable to optimise the folding and/or expression (and remove any frame-shifts or
stop codons) after the V gene has been fully assembled.

A further problem with existing libraries is that because the main-chain conformation is
heterogeneous, three-dimensional structural modelling is difficult because suitable high
resolution crystallographic data may not be available. This is a particular problem for
the H3 region, where the vast majority of antibodies derived from natural or synthetic
antibody libraries have medium length or long loops and therefore cannot be modelled.

Another problem with existing libraries is the reliance on epitope tags (such as the myc,
FLAG or HIS tags) for detection of expressed antibody fragments. As these are usually
located at the N or C terminal ends of the antibody fragment they tend to be prone to
proteolytic cleavage. Superantigens, such as Protein A and Protein L, can be used to
detect expressed antibody fragments by binding the folded domains themselves but
since they are Vh and Vl family specific, only a relatively small proportion of members
of any existing antibody library will bind one of these reagents and an even smaller
proportion will bind to both.

To this end, it would be desirable to have a selection system which could eliminate (or
at least reduce the proportion of) non-functional or poorly folded/expressed members of
the library before selection against the target antigen is carried out whilst enriching for
functional, folded and well expressed members all of which are able to bind generic
ligands such as the superantigens Protein A and Protein L. In addition, it would be
advantageous to construct an antibody library wherein all the members have natural
frameworks and have loops with known main-chain conformations.

The invention accordingly provides a method by which a polypeptide repertoire may be
selected to remove non-functional members. This results in a marked reduction in the
actual library size (and a corresponding increase in the quality of the library) without
reducing the functional library size. The invention also provides a method for creating
new polypeptide repertoires wherein all the functional members are able to bind a given
generic ligand. The same generic ligand can be used for the subsequent detection,
immobilisation, purification or immunoprecipitation of any one or more members of the
repertoire.

Any 'naive' or 'immune' antibody repertoire can be used with the present invention to
enrich for functional members and/or to enrich for members that bind a given generic
ligand or ligands. Indeed, since only a small percentage of all human germline Vh
segments bind Protein A with high affinity and only a small percentage of all human
germline Vl segments bind Protein L with high affinity, preselection with these
superantigens is highly advantageous. Alternatively, preselection via an epitope
tag enables non-functional variants to be removed from synthetic libraries. The libraries
that are amenable to preselection include, but are not limited to, libraries comprised of
V genes rearranged in vivo of the type described by Marks et al. (1991) J. Mol. Biol.,
222: 581 and Vaughan et al. (1996) Nature Biotech., 14: 309, synthetic libraries
whereby germline V gene segments are 'rearranged' in vitro (Hoogenboom & Winter
(1992) J. Mol. Biol., 227: 381; Nissim et al. (1994) EMBO J., 13: 692; Griffiths et al.
(1994) EMBO J., 13: 3245; De Kruif et al. (1995) J. Mol. Biol., 248: 97) or where
synthetic CDRs are incorporated into a single rearranged V gene (Barbas et al. (1992)
Proc. Natl. Acad. Sci. USA, 89: 4457) or into multiple master frameworks
(WO97/08320, Morphosys).

Selection of polypeptides according to the invention

Once a diverse pool of polypeptides is generated, selection according to the invention is
applied. Two broad selection procedures are based upon the order in which the generic
and target ligands are applied; combinatorial variations on these schemes involve the
use of multiple generic and/or target ligands in a given step of a selection. When a
combinatorial scheme is used, the pool of polypeptide molecules may be contacted
with, for example, several target ligands at once, or by each singly, in series; in the
latter case, the resulting selected pools of polypeptides may be kept separate or may
themselves be pooled. These selection schemes may be summarized as follows:

a. Selection procedure 1:

Initial polypeptide selection using the generic ligand

In order to remove non-functional members of the library, a generic ligand is
selected, such that the generic ligand is only bound by functional molecules. For
example, the generic ligand may be a metallic ion, an antibody (in the form of a
monoclonal antibody or a polyclonal mixture of antibodies), half of an enzyme/ligand
complex or organic material; note that ligands of any of these types are, additionally or
alternatively, of use as target ligands according to the invention. Antibody production
and metal affinity chromatography are discussed in detail below. Ideally, these ligands
bind a site (e.g. a peptide tag or superantigen binding site) on the members of the
library which is of constant structure or sequence, which structure is liable to be absent
or altered in non-functional members. In the case of antibody libraries, this method is
of use to select from a library only those functional members which have a binding site
for a given superantigen or monoclonal antibody; such an approach is useful in
selecting functional antibody polypeptides from both natural and synthetic pools
thereof.

The superantigens Protein A and/or Protein L are of use in the invention as generic
ligands to select antibody repertoires, since they bind correctly folded Vh and Vl
domains (which belong to certain Vh and Vl families), respectively, regardless of the
sequence and structure of the binding site for the target ligand. In addition, Protein A
or another superantigen, Protein G, is of use as a generic ligand to select for folding
and/or expression by binding the heavy chain constant domains of antibodies. Anti-κ
and anti-λ antibodies are also of use in selecting light chain constant domains. Small
organic mimetics of antibodies or of other binding proteins, such as Protein A (Li et al.
(1998) Nature Biotech., 16: 190), are also of use.

When this selection procedure is used, the generic ligand, by its very nature, is able to
bind all functional members of the preselected repertoire; therefore, this generic ligand
(or some conjugate thereof) may be used to detect, immobilise, purify or
immunoprecipitate any member or population of members from the repertoire (whether
selected by binding a given target ligand or not, as discussed below). Protein detection
via immunoassay techniques as well as immunoprecipitation of member polypeptides of
a repertoire of the invention may be performed by the techniques discussed below with
regard to the testing of antibody selection ligands of use in the invention (see
"Antibodies for use as ligands in polypeptide selection"). Immobilization may be
performed through specific binding of a polypeptide member of a repertoire to either a
generic or target ligand according to the invention which is, itself, linked to a solid or
semi-solid support, such as a filter (e.g. of nitrocellulose or nylon) or a
chromatographic support (including, but not limited to, a cellulose, polymer, resin or
silica support); covalent attachment of the member polypeptide to the generic or target
ligand may be performed using any of a number of chemical crosslinking agents known
to one of skill in the art. Immobilization on a metal affinity chromatography support is
described below (see "Metallic ligands as use for the selection of polypeptides").
Purification may comprise any or a combination of these techniques, in particular
immunoprecipitation and chromatography by methods well known in the art.

Using this approach, selection with multiple generic ligands can be performed either
one after another to create a repertoire in which all members bind two or more generic
ligands, separately in parallel, such that the subsets can then be combined (in this case
members of the preselected repertoire will bind at least one of the generic ligands) or
separately followed by incorporation into the same polypeptide chain, whereby a large
functional library is created in which all members may be able to bind all the generic
ligands used during preselection. For example, subsets can be selected from one or more
libraries using different generic ligands which bind heavy and light chains of antibody
molecules (see below) and then combined to form a heavy/light chain library, in which
the heavy and light chains are either non-covalently associated or are covalently linked,
for example, by using Vh and Vl domains in a single-chain Fv context.

Secondary polypeptide selection using the target ligand

Following the selection step with the generic ligand, the library is screened in
order to identify members that bind to the target ligand. Since it is enriched for
functional polypeptides after selection with the generic ligand, there will be an
advantageous reduction in non-specific ("background") binding during selection with
the target ligand. Furthermore, since selection with the generic ligand produces a
marked reduction in the actual library size (and a corresponding increase in the quality
of the library) without reducing the functional library size, a smaller repertoire should
elicit the same diversity of target ligand specificities and affinities as the larger starting
repertoire (that contained many non-functional and poorly folded/expressed members).

One or more target ligands may be used to select polypeptides from the first selected
polypeptide pool generated using the generic ligand. In the event that two or more
target ligands are used to generate a number of different subsets, two or more of these
subsets may be combined to form a single, more complex subset. A single generic
ligand is able to bind every member of the resulting combined subset; however, a given
target ligand binds only a subset of library members.
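
As a rough illustration of selection procedure 1, the sketch below models preselection with a generic ligand followed by target selection, and tracks the actual and functional library sizes. The `Clone` record, its fields and the five example members are hypothetical; the model simply assumes, as the text states, that the generic ligand is bound only by functional members.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str
    functional: bool     # folded and expressed
    binds_generic: bool  # generic ligand site intact (assumed only if functional)
    binds_target: bool


# Hypothetical starting library: two of five members are non-functional.
library = [
    Clone("a", True, True, True),
    Clone("b", True, True, False),
    Clone("c", False, False, False),
    Clone("d", False, False, False),
    Clone("e", True, True, True),
]


def preselect(clones):
    """Step 1: keep members bound by the generic ligand."""
    return [c for c in clones if c.binds_generic]


def select_target(clones):
    """Step 2: keep members bound by the target ligand."""
    return [c for c in clones if c.binds_target]


def functional_size(clones):
    """Number of functional members in a pool."""
    return sum(c.functional for c in clones)


preselected = preselect(library)
binders = select_target(preselected)

print(len(library), functional_size(library))          # actual vs functional size before
print(len(preselected), functional_size(preselected))  # actual size falls, functional size does not
print([c.name for c in binders])
```

Under these assumptions the actual library size drops after preselection while the functional size is unchanged, and every final binder is still bound by the single generic ligand.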

b. Selection procedure 2:

Initial selection of repertoire members with the target ligand

Here, selection using the target ligand is performed prior to selection using the
generic ligand. Obviously, the same set of polypeptides can result from either scheme,
if such a result is desired. Using this approach, selection with multiple target ligands
can be performed in parallel or by mixing the target ligands for selection. If performed
in parallel, the resulting subsets may, if required, be combined.

Secondary polypeptide selection using the generic ligand

Subsequent selection of the target ligand binding subset can then be performed using
one or more generic ligands. Whilst this is not a selection for function, since members
of the repertoire that are able to bind to the target ligand are by definition functional, it
does enable subsets that bind to different generic ligands to be isolated. Thus, the target
ligand selected population can be selected by one generic ligand or by two or more
generic ligands. In this case, the generic ligands can be used one after another to create
a repertoire in which all members bind the target ligand and two or more generic
ligands or separately in parallel, such that different (but possibly overlapping) subsets
binding the target ligand and different generic ligands are created. These can then be
combined (in this case, members will bind at least one of the generic ligands).
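
The statement that either selection order can yield the same set of polypeptides, and that parallel generic selections give overlapping subsets, can be illustrated with a small set-based sketch. The clone names and ligand labels below are invented for illustration only.

```python
# Hypothetical binding table: clone -> ligands it binds.
binds = {
    "c1": {"generic1", "generic2", "target"},
    "c2": {"generic1", "target"},
    "c3": {"generic2"},
    "c4": {"target"},
}


def select(clones, ligand):
    """Return the clones that bind `ligand`."""
    return {c for c in clones if ligand in binds[c]}


everything = set(binds)

# Procedure 1: generic ligand first, then target ligand.
proc1 = select(select(everything, "generic1"), "target")

# Procedure 2: target ligand first, then the same generic ligand.
target_binders = select(everything, "target")
proc2 = select(target_binders, "generic1")

# Procedure 2 with two generic ligands used separately in parallel.
subset1 = select(target_binders, "generic1")
subset2 = select(target_binders, "generic2")
combined = subset1 | subset2                       # binds at least one generic ligand
in_series = select(subset1, "generic2")            # binds both generic ligands

print(proc1 == proc2, sorted(subset1 & subset2), sorted(combined), sorted(in_series))
```

With one generic and one target ligand the two procedures reduce to the same intersection, so the order does not change the final set in this idealised model.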

Selection of immunoglobulin-family polypeptide library members

The members of the repertoires or libraries selected in the present invention
advantageously belong to the immunoglobulin superfamily of molecules, in particular,
antibody polypeptides or T-cell receptor polypeptides. For antibodies, it is envisaged
that the method according to this invention may be applied to any of the existing
antibody libraries known in the art (whether natural or synthetic) or to antibody
libraries designed specifically to be preselected with generic ligands (see below).

Construction of libraries of the invention 

a. Selection of the main-chain conformation

The members of the immunoglobulin superfamily all share a similar fold for
their polypeptide chain. For example, although antibodies are highly diverse in terms of
their primary sequence, comparison of sequences and crystallographic structures has
revealed that, contrary to expectation, five of the six antigen binding loops of antibodies
(H1, H2, L1, L2, L3) adopt a limited number of main-chain conformations, or
canonical structures (Chothia and Lesk (1987) supra; Chothia et al. (1989) supra).
Analysis of loop lengths and key residues has therefore enabled prediction of the
main-chain conformations of H1, H2, L1, L2 and L3 found in the majority of human
antibodies (Chothia et al. (1992) supra; Tomlinson et al. (1995) supra; Williams et al.
(1996) supra). Although the H3 region is much more diverse in terms of sequence,
length and structure (due to the use of D segments), it also forms a limited number of
main-chain conformations for short loop lengths which depend on the length and the
presence of particular residues, or types of residue, at key positions in the loop and the
antibody framework (Martin et al. (1996) supra; Shirai et al. (1996) supra).

According to the present invention, libraries of antibody polypeptides are designed in
which certain loop lengths and key residues have been chosen to ensure that the
main-chain conformation of the members is known. Advantageously, these are real
conformations of immunoglobulin superfamily molecules found in nature, to minimize
the chances that they are non-functional, as discussed above. Germline V gene segments
serve as one suitable basic framework for constructing antibody or T-cell receptor
libraries; other sequences are also of use. Variations may occur at a low frequency,
such that a small number of functional members may possess an altered main-chain
conformation, which does not affect their function.

Canonical structure theory is also of use in the invention to assess the number of
different main-chain conformations encoded by antibodies, to predict the main-chain
conformation based on antibody sequences and to choose residues for diversification
which do not affect the canonical structure. It is now known that, in the human Vκ
domain, the L1 loop can adopt one of four canonical structures, the L2 loop has a
single canonical structure and that 90% of human Vκ domains adopt one of four or five
canonical structures for the L3 loop (Tomlinson et al. (1995) supra); thus, in the Vκ
domain alone, different canonical structures can combine to create a range of different
main-chain conformations. Given that the Vλ domain encodes a different range of
canonical structures for the L1, L2 and L3 loops and that Vκ and Vλ domains can pair
with any Vh domain which can encode several canonical structures for the H1 and H2
loops, the number of canonical structure combinations observed for these five loops is
very large. This implies that the generation of diversity in the main-chain conformation
may be essential for the production of a wide range of binding specificities. However,
by constructing an antibody library based on a single known main-chain conformation it
was found, contrary to expectation, that diversity in the main-chain conformation is not
required to generate sufficient diversity to target substantially all antigens. Even more
surprisingly, the single main-chain conformation need not be a consensus structure - a
single naturally occurring conformation can be used as the basis for an entire library.
Thus, in a preferred aspect, the invention provides a library in which the members
encode a single known main-chain conformation. It is to be understood, however, that
occasional variations may occur such that a small number of functional members may
possess an alternative main-chain conformation, which may be unknown.
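
To give a feel for how canonical structures combine, the sketch below enumerates the combinations for the human light-chain variable domain discussed above (read here as Vκ), using only the counts quoted in the text: four structures for L1, one for L2 and, taking the upper figure, five for L3. The structure labels are placeholders, not real canonical structure assignments.

```python
from itertools import product

# Counts taken from the text; the labels themselves are hypothetical.
l1_structures = [f"L1-CS{i}" for i in range(1, 5)]  # four canonical structures
l2_structures = ["L2-CS1"]                          # a single canonical structure
l3_structures = [f"L3-CS{i}" for i in range(1, 6)]  # upper figure of "four or five"

# Every way of combining one structure per loop within this one domain.
combinations = list(product(l1_structures, l2_structures, l3_structures))

print(len(combinations))
print(combinations[0])
```

Pairing each of these with the several H1 and H2 structures of the heavy chain multiplies the count further, which is the diversity that a single-conformation library deliberately forgoes.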

The single main-chain conformation that is chosen is preferably commonplace among
molecules of the immunoglobulin superfamily type in question. A conformation is
commonplace when a significant number of naturally occurring molecules are observed
to adopt it. Accordingly, in a preferred aspect of the invention, the natural occurrence
of the different main-chain conformations for each binding loop of an immunoglobulin
superfamily molecule is considered separately and then a naturally occurring
immunoglobulin superfamily molecule is chosen which possesses the desired
combination of main-chain conformations for the different loops. If none is available,
the nearest equivalent may be chosen. Since a disadvantage of immunoglobulin-family
polypeptide libraries of the prior art is that many members have unnatural frameworks
or contain framework mutations (see above), in the case of antibodies or T-cell
receptors, it is preferable that the desired combination of main-chain conformations for
the different loops is created by selecting germline gene segments which encode the
desired main-chain conformations. It is more preferable that the selected germline gene
segments are frequently expressed and most preferable that they are the most frequently
expressed.

In designing antibody libraries, therefore, the incidence of the different main-chain
conformations for each of the six antigen binding loops may be considered separately.
For H1, H2, L1, L2 and L3, a given conformation that is adopted by between 20% and
100% of the antigen binding loops of naturally occurring molecules is chosen.
Typically, its observed incidence is above 35% (i.e. between 35% and 100%) and,
ideally, above 50% or even above 65%. Since the vast majority of H3 loops do not
have canonical structures, it is preferable to select a main-chain conformation which is
commonplace among those loops which do display canonical structures. For each of the
loops, the conformation which is observed most often in the natural repertoire is
therefore selected. In human antibodies, the most popular canonical structures (CS) for
each loop are as follows: H1 - CS 1 (79% of the expressed repertoire), H2 - CS 3
(46%), L1 - CS 2 of Vκ (39%), L2 - CS 1 (100%), L3 - CS 1 of Vκ (36%) (calculation
assumes a κ:λ ratio of 70:30, Hood et al. (1967) Cold Spring Harbor Symp. Quant.
Biol. [...]). [...] structures, a CDR3 length (Kabat et al.
(1991) Sequences of proteins of immunological interest, U.S. Department of Health
and Human Services) of seven residues with a salt-bridge from residue 94 to residue
[...] antibody sequences in
the EMBL data library with the required H3 length and key residues to form this
conformation and at least two crystallographic structures in the protein data bank which
can be used as a basis for antibody modelling (2cgr and 1tet). The most frequently
expressed germline gene segments that encode this combination of canonical structures
are the Vh segment 3-23 (DP-47), the Jh segment JH4b, the Vκ segment O2/O12 (DPK9)
and the Jκ segment Jκ1. These segments can therefore be used in combination as a basis
to construct a library with the desired single main-chain conformation.
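
The incidence thresholds given above (at least 20%, typically above 35%, ideally above 50% or 65%) can be applied to the quoted per-loop figures as follows. This is only a sketch: the `tier` function and its band labels are an illustrative reading of those thresholds, and the percentages are the ones quoted in the text for the most popular canonical structure of each loop.

```python
# Most popular canonical structure per loop and its quoted incidence (%).
most_popular = {
    "H1": ("CS 1", 79),
    "H2": ("CS 3", 46),
    "L1": ("CS 2", 39),
    "L2": ("CS 1", 100),
    "L3": ("CS 1", 36),
}


def tier(incidence_pct):
    """Classify an incidence against the preference bands named in the text."""
    if incidence_pct > 65:
        return "above 65%"
    if incidence_pct > 50:
        return "above 50%"
    if incidence_pct > 35:
        return "above 35%"
    if incidence_pct >= 20:
        return "between 20% and 35%"
    return "below 20%"


tiers = {loop: tier(pct) for loop, (_, pct) in most_popular.items()}

for loop, (structure, pct) in most_popular.items():
    print(f"{loop}: {structure} at {pct}% -> {tiers[loop]}")
```

On these figures every chosen structure clears the "typical" 35% threshold, with H1 and L2 also clearing the 65% band.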



Alternatively, instead of choosing the single main-chain conformation based on the
natural occurrence of the different main-chain conformations for each of the binding
loops in isolation, the natural occurrence of combinations of main-chain conformations
is used as the basis for choosing the single main-chain conformation. In the case of
antibodies, for example, the natural occurrence of canonical structure combinations for
two, three, four, five or for all six of the antigen binding loops can be determined.
Here, it is preferable that the chosen conformation is commonplace in naturally
occurring antibodies and most preferable that it is observed most frequently in the natural
repertoire. Thus, in human antibodies, for example, when natural combinations of the
five antigen binding loops, H1, H2, L1, L2 and L3, are considered, the most frequent
combination of canonical structures is determined and then combined with the most
popular conformation for the H3 loop [...].

b. Diversification of the canonical sequence

Having selected several known main-chain conformations or, preferably, a single
known main-chain conformation, the library of the invention is constructed by varying
the binding site of the molecule in order to generate a repertoire with structural and/or
functional diversity. This means that variants are generated such that they possess
sufficient diversity in their structure and/or in their function so that they are capable of
[...].

The desired diversity is typically generated by varying the selected molecule at one or
more positions. The positions to be changed can be chosen at random or are preferably
selected. The variation can then be achieved either by randomization, during which the
resident amino acid is replaced by any amino acid or analogue thereof, natural or
synthetic, producing a very large number of variants or by replacing the resident amino
acid with one or more of a defined subset of amino acids, producing a more limited
number of variants.

Various methods have been reported for introducing such diversity. Error-prone PCR
(Hawkins et al. (1992) J. Mol. Biol., 226: 889), chemical mutagenesis (Deng et al.
(1994) J. Biol. Chem., 269: 9533) or bacterial mutator strains (Low et al. (1996) J.
Mol. Biol., 260: 359) can be used to introduce random mutations into the genes that
encode the molecule. Methods for mutating selected positions are also well known in
the art and include the use of mismatched oligonucleotides or degenerate
oligonucleotides, with or without the use of PCR. For example, several synthetic
antibody libraries have been created by targeting mutations to the antigen binding
loops. The H3 region of a human tetanus toxoid-binding Fab has been randomized to
create a range of new binding specificities (Barbas et al. (1992) supra). Random or
semi-random H3 and L3 regions have been appended to germline V gene segments to
produce large libraries with unmutated framework regions (Hoogenboom and Winter
(1992) supra; Nissim et al. (1994) supra; Griffiths et al. (1994) supra; De Kruif et al.
(1995) supra). Such diversification has been extended to include some or all of the
other antigen binding loops (Crameri et al. (1996) Nature Med., 2: 100; Riechmann et
al. (1995) Bio/Technology, 13: 475; Morphosys, WO97/08320, supra).

Since loop randomization has the potential to create more than approximately 10^[...]
structures for H3 alone and a similarly large number of variants for the other five
loops, it is not feasible using current transformation technology or even by using cell
free systems to produce a library representing all possible combinations. For example,
in one of the largest libraries constructed to date, 6 x 10^10 different antibodies, which is
only a fraction of the potential diversity for a library of this design, were generated
(Griffiths et al. (1994) supra).
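
The scale of the problem can be checked with simple arithmetic. The sketch below assumes full randomisation over the 20 natural amino acids at each position and assumes the quoted library size reads 6 x 10^10; the function names and the choice of 12 positions in the usage line are illustrative only.

```python
AMINO_ACIDS = 20
LIBRARY_SIZE = 6 * 10**10  # assumed reading of the library size quoted above


def theoretical_diversity(positions):
    """Number of distinct sequences when `positions` residues are fully randomised."""
    return AMINO_ACIDS ** positions


def smallest_unreachable(library_size):
    """Fewest randomised positions whose sequence space exceeds `library_size`."""
    positions = 1
    while theoretical_diversity(positions) <= library_size:
        positions += 1
    return positions


limit = smallest_unreachable(LIBRARY_SIZE)
coverage_12 = LIBRARY_SIZE / theoretical_diversity(12)

print(limit)
print(f"{coverage_12:.2e}")  # fraction of a 12-position space such a library could sample
```

Even a handful of fully randomised positions exceeds what a library of this size can sample exhaustively, which is the motivation for restricting diversification to selected residues.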

In addition to the removal of non-functional members and the use of a single known
main-chain conformation, the present invention addresses these limitations by
diversifying only those residues which are directly involved in creating or modifying
the desired function of the molecule. For many molecules, the function will be to bind a
target ligand and therefore diversity should be concentrated in the target ligand binding
site, while avoiding changing residues which are crucial to the overall packing of the
molecule or to maintaining the chosen main-chain conformation; therefore, the
invention provides a library wherein the selected positions to be varied may be those
that constitute the binding site for the target ligand.

Diversification of the canonical sequence as it applies to antibodies

In the case of an antibody library, the binding site for the target ligand is most
often the antigen binding site. Thus, in a highly preferred aspect, the invention provides
an antibody library in which only those residues in the antigen binding site are varied.
These residues are extremely diverse in the human antibody repertoire and are known
to make contacts in high-resolution antibody/antigen complexes. For example, in L2 it
is known that positions 50 and 53 are diverse in naturally occurring antibodies and are
observed to make contact with the antigen. In contrast, the conventional approach
would have been to diversify all the residues in the corresponding Complementarity
Determining Region (CDR2) as defined by Kabat et al. (1991, supra), some seven
residues compared to the two diversified in the library according to the invention. This
represents a significant improvement in terms of the functional diversity required to
create a range of antigen binding specificities.

In nature, antibody diversity is the result of two processes: somatic recombination of
germline V, D and J gene segments to create a naive primary repertoire (so called
germline and junctional diversity) and somatic hypermutation of the resulting
rearranged V genes. Analysis of human antibody sequences has shown that diversity in
the primary repertoire is focused at the centre of the antigen binding site whereas
somatic hypermutation spreads diversity to regions at the periphery of the antigen
binding site that are highly conserved in the primary repertoire (see Tomlinson et al.
(1996) supra). This complementarity has probably evolved as an efficient strategy for
searching sequence space and, although apparently unique to antibodies, it can easily be
applied to other polypeptide repertoires according to the invention. According to the
invention, the residues which are varied are a subset of those that form the binding site
for the target ligand. Different (including overlapping) subsets of residues in the target
ligand binding site are diversified at different stages during selection, if desired.

In the case of an antibody repertoire, the two-step process of the invention is analogous
to the maturation of antibodies in the human immune system. An initial 'naive'
repertoire is created where some, but not all, of the residues in the antigen binding site
are diversified. As used herein in this context, the term "naive" refers to antibody
molecules that have no pre-determined target ligand. These molecules resemble those
which are encoded by the immunoglobulin genes of an individual who has not
undergone immune diversification, as is the case with fetal and newborn individuals
whose immune systems have not yet been challenged by a wide variety of antigenic
stimuli. This repertoire is then selected against a range of antigens. If required, further
diversity can then be introduced outside the region diversified in the initial repertoire.
This matured repertoire can be selected for modified function, specificity or affinity.


The invention provides two different naive repertoires of antibodies in which some or
all of the residues in the antigen binding site are varied. The "primary" library mimics
the natural primary repertoire, with diversity restricted to residues at the centre of the
antigen binding site that are diverse in the germline V gene segments (germline
diversity) or diversified during the recombination process (junctional diversity). Those
residues which are diversified include, but are not limited to, H50, H52, H52a, H53,
H55, H56, H58, H95, H96, H97, H98, L50, L53, L91, L92, L93, L94 and L96. In
the "somatic" library, diversity is restricted to residues that are diversified during the
recombination process (junctional diversity) or are highly somatically mutated. Those
residues which are diversified include, but are not limited to: H31, H33, H35, H95,
H96, H97, H98, L30, L31, L32, L34 and L96. All the residues listed above as suitable
for diversification in these libraries are known to make contacts in one or more
antibody-antigen complexes. Since in both libraries, not all of the residues in the
antigen binding site are varied, additional diversity is incorporated during selection by
varying the remaining residues, if it is desired to do so. It shall be apparent to one
skilled in the art that any subset of any of these residues (or additional residues which
comprise the antigen binding site) can be used for the initial and/or subsequent
diversification of the antigen binding site.

In the construction of libraries according to the invention, diversification of chosen
positions is typically achieved at the nucleic acid level, by altering the coding sequence
which specifies the sequence of the polypeptide such that a number of possible amino
acids (all 20 or a subset thereof) can be incorporated at that position. Using the IUPAC
nomenclature, the most versatile codon is NNK, which encodes all amino acids as well
as the TAG stop codon. The NNK codon is preferably used in order to introduce the
required diversity. Other codons which achieve the same ends are also of use, including
the NNN codon, which leads to the production of the additional stop codons TGA and
TAA.
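
The coverage of the NNK and NNN codons can be checked by direct enumeration. The
following is a minimal Python sketch, assuming the standard genetic code; the helper
names (expand, encoded) are illustrative only and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each position ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# IUPAC symbols needed here: N = any base, K = G or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand(degenerate_codon):
    """List every concrete codon represented by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[s] for s in degenerate_codon))]


def encoded(degenerate_codon):
    """Return (set of encoded amino acids, sorted list of stop codons)."""
    codons = expand(degenerate_codon)
    amino_acids = {CODON_TABLE[c] for c in codons if CODON_TABLE[c] != "*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return amino_acids, stops


nnk_aa, nnk_stops = encoded("NNK")
nnn_aa, nnn_stops = encoded("NNN")
print(len(expand("NNK")), len(nnk_aa), nnk_stops)
print(len(expand("NNN")), len(nnn_aa), nnn_stops)
```

Under these assumptions NNK expands to 32 codons and NNN to 64; both cover all 20
amino acids, and they differ only in which stop codons are included.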

A feature of side-chain diversity in the antigen binding site of human antibodies is a
pronounced bias which favors certain amino acid residues. If the amino acid
composition of the ten most diverse positions in each of the VH, Vκ and Vλ regions is
summed, more than 76% of the side-chain diversity comes from only seven different
residues, these being serine (24%), tyrosine (14%), asparagine (11%), glycine (9%),
alanine (7%), aspartate (6%) and threonine (6%). This bias towards hydrophilic
residues and small residues which can provide main-chain flexibility probably reflects
the evolution of surfaces which are predisposed to binding a wide range of antigens and
may help to explain the required promiscuity of antibodies in the primary repertoire.

Since it is preferable to mimic this distribution of amino acids, the invention provides a
library wherein the distribution of amino acids at the positions to be varied mimics that
seen in the antigen binding site of antibodies. Such bias in the substitution of amino
acids that permits selection of certain polypeptides (not just antibody polypeptides)
against a range of target ligands is easily applied to any polypeptide repertoire
according to the invention. There are various methods for biasing the amino acid
distribution at the position to be varied (including the use of tri-nucleotide mutagenesis,
WO97/08320, Morphosys, supra), of which the preferred method, due to ease of
synthesis, is the use of conventional degenerate codons. By comparing the amino acid
profile encoded by all combinations of degenerate codons (with single, double, triple
and quadruple degeneracy in equal ratios at each position) with the natural amino acid
use it is possible to calculate the most representative codon. The codons
(AGT)(AGC)T, (AGT)(AGC)C and (AGT)(AGC)(CT) - that is, DVT, DVC and DVY
respectively using IUPAC nomenclature - are those closest to the desired amino acid
profile: they encode 22% serine and 11% each of tyrosine, asparagine, glycine, alanine,
aspartate, threonine and cysteine. Preferably, therefore, libraries are constructed using
either the DVT, DVC or DVY codon at each of the diversified positions.
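
The amino acid profile quoted for the DVT, DVC and DVY codons can be reproduced by
enumeration. The sketch below is illustrative only: it assumes the standard genetic
code and equal base ratios at each degenerate position, and the function name
amino_acid_profile is a hypothetical helper, not part of the disclosure.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each position ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# IUPAC symbols needed here: D = A/G/T, V = A/C/G, Y = C/T.
IUPAC = {"T": "T", "C": "C", "D": "AGT", "V": "ACG", "Y": "CT"}


def amino_acid_profile(degenerate_codon):
    """Fraction of codons encoding each amino acid, assuming equal base ratios."""
    codons = ["".join(p) for p in product(*(IUPAC[s] for s in degenerate_codon))]
    counts = Counter(CODON_TABLE[c] for c in codons)
    return {aa: Fraction(n, len(codons)) for aa, n in counts.items()}


for codon in ("DVT", "DVC", "DVY"):
    profile = amino_acid_profile(codon)
    summary = ", ".join(f"{aa}={float(f):.0%}" for aa, f in sorted(profile.items()))
    print(codon, summary)
```

Each of the three codons yields serine at 2/9 of codons (about 22%) and the seven
other residues at 1/9 (about 11%) each, with no stop codons.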

As stated above, polypeptides which make up antibody libraries according to the
invention may be whole antibodies or fragments thereof, such as Fab, F(ab')2, Fv or
scFv fragments, or separate VH or VL domains, any of which is either modified or
unmodified. Of these, single-chain Fv fragments, or scFvs, are of particular use. ScFv
fragments, as well as other antibody polypeptides, are reliably generated by antibody
engineering methods well known in the art. The scFv is formed by connecting the VH
and VL genes using an oligonucleotide that encodes an appropriately designed linker
peptide, such as (Gly-Gly-Gly-Gly-Ser)3 or equivalent linker peptide(s). The linker
bridges the C-terminal end of the first V region and N-terminal end of the second V
region, ordered as either VH-linker-VL or VL-linker-VH. In principle, the binding site of
the scFv can faithfully reproduce the specificity of the corresponding whole antibody
and vice-versa.

Similar techniques for the construction of Fv, Fab and F(ab')2 fragments, as well as
chimeric antibody molecules, are well known in the art. When expressing Fv fragments,
precautions should be taken to ensure correct chain folding and association. For Fab
and F(ab')2 fragments, VH and VL polypeptides are combined with constant region
segments, which may be isolated from rearranged genes, germline C genes or
synthesised from antibody sequence data as for V region segments. A library according
to the invention may be a VH or VL library. Thus, separate libraries comprising single
VH and VL domains may be constructed and, optionally, include CH or CL domains,
respectively, creating Dab molecules.
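
The two scFv orientations described above, joined through a (Gly-Gly-Gly-Gly-Ser)3
linker, can be sketched at the amino acid level as follows. This is an illustration
only: the function assemble_scfv and the short placeholder sequences are hypothetical
and do not represent any real V domain.

```python
LINKER_UNIT = "GGGGS"        # Gly-Gly-Gly-Gly-Ser in one-letter code
LINKER = LINKER_UNIT * 3     # (Gly-Gly-Gly-Gly-Ser)3, 15 residues


def assemble_scfv(vh, vl, orientation="VH-VL"):
    """Join two V-domain amino acid sequences through the linker peptide."""
    if orientation == "VH-VL":
        first, second = vh, vl
    elif orientation == "VL-VH":
        first, second = vl, vh
    else:
        raise ValueError("orientation must be 'VH-VL' or 'VL-VH'")
    # The linker bridges the C-terminus of the first domain
    # and the N-terminus of the second.
    return first + LINKER + second


# Hypothetical, truncated placeholder sequences for illustration only.
vh_example = "EVQLVESGGG"
vl_example = "DIQMTQSPSS"
print(assemble_scfv(vh_example, vl_example))
print(assemble_scfv(vh_example, vl_example, orientation="VL-VH"))
```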

c. Library vector systems according to the invention

Libraries according to the invention can be used for direct screening using the 
generic and/or target ligands or used in a selection protocol that involves a genetic 
display package. 

Bacteriophage lambda expression systems may be screened directly as bacteriophage
plaques or as colonies of lysogens, both as previously described (Huse et al. (1989)
Science, 246: 1275; Caton and Koprowski (1990) Proc. Natl. Acad. Sci. U.S.A., 87;
Mullinax et al. (1990) Proc. Natl. Acad. Sci. U.S.A., 87: 8095; Persson et al. (1991)
Proc. Natl. Acad. Sci. U.S.A., 88: 2432) and are of use in the invention. Whilst such
expression systems can be used to screen up to 10^6 different members of a library,
they are not really suited to screening of larger numbers (greater than 10^6 members).
Other screening systems rely, for example, on direct chemical synthesis of library
members. One early method involves the synthesis of peptides on a set of pins or rods,
such as described in WO84/03564. A similar method involving peptide synthesis on
beads, which forms a peptide library in which each bead is an individual library
member, is described in U.S. Patent No. 4,631,211 and a related method is described
in WO92/00091. A significant improvement of the bead-based methods involves
tagging each bead with a unique identifier tag, such as an oligonucleotide, so as to
facilitate identification of the amino acid sequence of each library member. These
improved bead-based methods are described in WO93/06121.

Another chemical synthesis method involves the synthesis of arrays of peptides (or
peptidomimetics) on a surface in a manner that places each distinct library member
(e.g., unique peptide sequence) at a discrete, predefined location in the array. The
identity of each library member is determined by its spatial location in the array. The
locations in the array where binding interactions between a predetermined molecule
(e.g., a receptor) and reactive library members occur are determined, thereby identifying
the sequences of the reactive library members on the basis of spatial location. These
methods are described in U.S. Patent No. 5,143,854; WO90/15070 and WO92/10092;
Fodor et al. (1991) Science, 251: 767; Dower and Fodor (1991) Ann. Rep. Med.
Chem., 26: 271.
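
The principle of identifying library members by spatial location can be illustrated
with a simple lookup. The array layout, peptide sequences and binding positions below
are all invented for the purpose of the sketch.

```python
# Hypothetical 2 x 3 array: each (row, column) address holds one peptide.
array_layout = {
    (0, 0): "ACDEFG", (0, 1): "HIKLMN", (0, 2): "PQRSTV",
    (1, 0): "WYACDE", (1, 1): "FGHIKL", (1, 2): "MNPQRS",
}


def identify_binders(layout, bound_addresses):
    """Map addresses where binding was detected back to peptide sequences."""
    return {address: layout[address] for address in bound_addresses}


# Assumed data: a labelled receptor gave signal at two positions.
hits = identify_binders(array_layout, [(0, 1), (1, 2)])
print(hits)
```

No sequencing step is needed in such a scheme, since the address alone determines
the identity of each reactive member.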

Of particular use in the construction of libraries of the invention are selection display
systems, which enable a nucleic acid to be linked to the polypeptide it expresses. As
used herein, a selection display system is a system that permits the selection, by
suitable display means, of the individual members of the library by binding the generic
and/or target ligands.

Any selection display system may be used in conjunction with a library according to the
invention. Selection protocols for isolating desired members of large libraries are
known in the art, as typified by phage display techniques. Such systems, in which
diverse peptide sequences are displayed on the surface of filamentous bacteriophage
(Scott and Smith (1990) supra), have proven useful for creating libraries of antibody
fragments (and the nucleotide sequences that encode them) for the in vitro selection
and amplification of specific antibody fragments that bind a target antigen. The
nucleotide sequences encoding the VH and VL regions are linked to gene fragments
which encode leader signals that direct them to the periplasmic space of E. coli and as a
result the resultant antibody fragments are displayed on the surface of the
bacteriophage, typically as fusions to bacteriophage coat proteins (e.g., pIII or pVIII).
Alternatively, antibody fragments are displayed externally on lambda phage capsids
(phagebodies). An advantage of phage-based display systems is that, because they are
biological systems, selected library members can be amplified simply by growing the
phage containing the selected library member in bacterial cells. Furthermore, since the
nucleotide sequence that encodes the polypeptide library member is contained on a phage
or phagemid vector, sequencing, expression and subsequent genetic manipulation are
relatively straightforward.

Methods for the construction of bacteriophage antibody display libraries and lambda
phage expression libraries are well known in the art (McCafferty et al. (1990) supra;
Kang et al. (1991) Proc. Natl. Acad. Sci. U.S.A., 88: 4363; Clackson et al. (1991)
Nature, 352: 624; Lowman et al. (1991) Biochemistry, 30: 10832; Burton et al. (1991)
Proc. Natl. Acad. Sci. U.S.A., 88: 10134; Hoogenboom et al. (1991) Nucleic Acids
Res., 19: 4133; Chang et al. (1991) J. Immunol., 147: 3610; Breitling et al. (1991)
Gene, 104: 147; Marks et al. (1991) supra; Barbas et al. (1992) supra; Hawkins and
Winter (1992) J. Immunol., 22: 867; Marks et al., 1992, J. Biol. Chem., 267: 16007;
Lerner et al. (1992) Science, 258: 1313, incorporated herein by reference).

One particularly advantageous approach has been the use of scFv phage-libraries
(Huston et al., 1988, Proc. Natl. Acad. Sci. U.S.A., 85: 5879-5883; Chaudhary et al.
(1990) Proc. Natl. Acad. Sci. U.S.A., 87: 1066-1070; McCafferty et al. (1990) supra;
Clackson et al. (1991) supra; Marks et al. (1991) supra; Chiswell et al. (1992) Trends
Biotech., 10: 80; Marks et al. (1992) supra). Various embodiments of scFv libraries
displayed on bacteriophage coat proteins have been described. Refinements of phage
display approaches are also known, for example as described in WO96/06213 and
WO92/01047 (Medical Research Council et al.) and WO97/08320 (Morphosys, supra),
which are incorporated herein by reference.



Other systems for generating libraries of polypeptides or nucleotides involve the use of
cell-free enzymatic machinery for the in vitro synthesis of the library members. In one
method, RNA molecules are selected by alternate rounds of selection against a target
ligand and PCR amplification (Tuerk and Gold (1990) Science, 249: 505; Ellington and
Szostak (1990) Nature, 346: 818). A similar technique may be used to identify DNA
sequences which bind a predetermined human transcription factor (Thiesen and Bach
(1990) Nucleic Acids Res., 18: 3203; Beaudry and Joyce (1992) Science, 257: 635;
WO92/05258 and WO92/14843). In a similar way, in vitro translation can be used to
synthesise polypeptides as a method for generating large libraries. These methods,
which generally comprise stabilised polysome complexes, are described further in
WO88/08453, WO90/05785, WO90/07003, WO91/02076, WO91/05058, and
WO92/02536. Alternative display systems which are not phage-based, such as those
disclosed in WO95/22625 and WO95/11922 (Affymax), use the polysomes to display
polypeptides for selection. These and all the foregoing documents also are incorporated
herein by reference.

The invention accordingly provides a method for selecting a polypeptide having a
desired generic and/or target ligand binding site from a repertoire of polypeptides,
comprising the steps of:

a) expressing a library according to the preceding aspects of the invention;

b) contacting the polypeptides with the generic and/or target ligand and selecting
those which bind the generic and/or target ligand;

c) optionally amplifying the selected polypeptide(s) which bind the generic
and/or target ligand; and

d) optionally repeating steps a) - c).

Preferably, steps a)-d) are performed using a phage display system.

Since the invention provides a library of polypeptides which have binding sites for both
generic and target ligands, the above selection method can be applied to a selection
using either the generic ligand or the target ligand. Thus, the initial library can be
selected using the generic ligand and then the target ligand or using the target ligand
and then the generic ligand. The invention also provides for multiple selections using
different generic ligands either in parallel or in series before or after selection with the
target ligand.
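
The logic of sequential selection with generic and target ligands, in either order, can
be sketched as a pair of filters applied to a toy library. Everything in this example
(the Member class, the boolean binding flags and the four clones) is a hypothetical
simplification; real selections are enrichment steps rather than exact filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    binds_generic: bool   # bound by the generic ligand (functional member)
    binds_target: bool    # bound by the target ligand


def select(population, ligand):
    """Keep members that bind the named ligand ('generic' or 'target')."""
    attribute = {"generic": "binds_generic", "target": "binds_target"}[ligand]
    return [m for m in population if getattr(m, attribute)]


def sequential_selection(library, order):
    """Apply selections in the given order, e.g. ('generic', 'target')."""
    population = list(library)
    for ligand in order:
        population = select(population, ligand)
    return population


library = [
    Member("clone1", True, True),
    Member("clone2", True, False),
    Member("clone3", False, False),
    Member("clone4", False, True),   # binds target but not the generic ligand
]
generic_first = sequential_selection(library, ("generic", "target"))
target_first = sequential_selection(library, ("target", "generic"))
print([m.name for m in generic_first])
print([m.name for m in target_first])
```

In this idealised model both orders recover the same members; the practical difference
lies in the size and quality of the intermediate population carried into the second
selection.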

Preferably, the method according to the invention further comprises the steps of 
subjecting the selected polypeptide(s) to additional variation (as described herein) and 
repeating steps a) to d). 



Since the generic ligand, by its very nature, is able to bind all library members selected
using the generic ligand, the method according to the invention further comprises the
use of the generic ligand (or some conjugate thereof) to detect, immobilise, purify or
immunoprecipitate any functional member or population of members from the library
(whether selected by binding the target ligand or not).

Since the invention provides a library in which the members have a known main-chain
conformation, the method according to the invention further comprises the production of
a three-dimensional structural model of any functional member of the library (whether
selected by binding the target ligand or not). Preferably, the building of such a model
involves homology modelling and/or molecular replacement. A preliminary model of
the main-chain conformation can be created by comparison of the polypeptide sequence
to the sequence of a known three-dimensional structure, by secondary structure
prediction or by screening structural libraries. Computational software may also be
used to predict the secondary structure of the polypeptide. In order to predict the
conformations of the side-chains at the varied positions, a side-chain rotamer library
may be employed.

In general, the nucleic acid molecules and vector constructs required for the
performance of the present invention are available in the art and may be constructed
and manipulated as set forth in standard laboratory manuals, such as Sambrook et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, USA.



The manipulation of nucleic acids in the present invention is typically carried out in
recombinant vectors. As used herein, vector refers to a discrete element that is used to
introduce heterologous DNA into cells for the expression and/or replication thereof.
Methods by which to select or construct and, subsequently, use such vectors are well
known to one of moderate skill in the art. Numerous vectors are publicly available,
including bacterial plasmids, bacteriophage, artificial chromosomes and episomal
vectors. Such vectors may be used for simple cloning and mutagenesis; alternatively, as
is typical of vectors in which repertoire (or pre-repertoire) members of the invention
are carried, a gene expression vector is employed. A vector of use according to the
invention may be selected to accommodate a polypeptide coding sequence of a desired
size, typically from 0.25 kilobase (kb) to 40 kb in length. A suitable host cell is
transformed with the vector after in vitro cloning manipulations. Each vector contains
various functional components, which generally include a cloning (or "polylinker")
site, an origin of replication and at least one selectable marker gene. If a given vector is
an expression vector, it additionally possesses one or more of the following: enhancer
element, promoter, transcription termination and signal sequences, each positioned in
the vicinity of the cloning site, such that they are operatively linked to the gene
encoding a polypeptide repertoire member according to the invention.

Both cloning and expression vectors generally contain nucleic acid sequences that
enable the vector to replicate in one or more selected host cells. Typically in cloning
vectors, this sequence is one that enables the vector to replicate independently of the
host chromosomal DNA and includes origins of replication or autonomously replicating
sequences. Such sequences are well known for a variety of bacteria, yeast and viruses.
The origin of replication from the plasmid pBR322 is suitable for most Gram-negative
bacteria, the 2 micron plasmid origin is suitable for yeast, and various viral origins
(e.g. SV40, adenovirus) are useful for cloning vectors in mammalian cells. Generally,
the origin of replication is not needed for mammalian expression vectors unless these
are used in mammalian cells able to replicate high levels of DNA, such as COS cells.

Advantageously, a cloning or expression vector may contain a selection gene, also
referred to as a selectable marker. This gene encodes a protein necessary for the survival
or growth of transformed host cells grown in a selective culture medium. Host cells not
transformed with the vector containing the selection gene will therefore not survive in
the culture medium. Typical selection genes encode proteins that confer resistance to
antibiotics and other toxins, e.g. ampicillin, neomycin, methotrexate or tetracycline,
complement auxotrophic deficiencies, or supply critical nutrients not available in the
growth media.

the vector is able to replicate as a plasmid with no expression, produce large quantities
of the polypeptide library member only or produce phage, some of which contain at
least one copy of the polypeptide-pIII fusion on their surface.

Construction of vectors according to the invention employs conventional ligation
techniques. Isolated vectors or DNA fragments are cleaved, tailored, and religated in
the form desired to generate the required vector. If desired, analysis to confirm that the
correct sequences are present in the constructed vector can be performed in a known
fashion. Suitable methods for constructing expression vectors, preparing in vitro
transcripts, introducing DNA into host cells, and performing analyses for assessing
expression and function are known to those skilled in the art. The presence of a gene
sequence in a sample is detected, or its amplification and/or expression quantified, by
conventional methods, such as Southern or Northern analysis, Western blotting, dot
blotting of DNA, RNA or protein, in situ hybridization, immunocytochemistry or
sequence analysis of nucleic acid or protein molecules. Those skilled in the art will
readily envisage how these methods may be modified, if desired.

Mutagenesis using the polymerase chain reaction (PCR)

Once a vector system is chosen and one or more nucleic acid sequences
encoding polypeptides of interest are cloned into the library vector, one may generate
diversity within the cloned molecules by undertaking mutagenesis prior to expression;
alternatively, the encoded proteins may be expressed and selected, as described above,
before mutagenesis and additional rounds of selection are performed. As stated above,
mutagenesis of nucleic acid sequences encoding structurally optimized polypeptides is
carried out by standard molecular methods. Of particular use is the polymerase chain
reaction, or PCR (Mullis and Faloona (1987) Methods Enzymol., 155: 335, herein
incorporated by reference). PCR, which uses multiple cycles of DNA replication
catalyzed by a thermostable, DNA-dependent DNA polymerase to amplify the target
sequence of interest, is well known in the art.

Oligonucleotide primers useful according to the invention are single-stranded DNA or
RNA molecules that hybridize to a nucleic acid template to prime enzymatic synthesis
of a second nucleic acid strand. The primer is complementary to a portion of a target
molecule present in a pool of nucleic acid molecules used in the preparation of sets of
arrays of the invention. It is contemplated that such a molecule is prepared by synthetic
methods, either chemical or enzymatic. Alternatively, such a molecule or a fragment
thereof is naturally occurring, and is isolated from its natural source or purchased from
a commercial supplier. Mutagenic oligonucleotide primers are 15 to 100 nucleotides in
length, ideally from 20 to 40 nucleotides, although oligonucleotides of different length
are of use.

Typically, selective hybridization occurs when two nucleic acid sequences are
substantially complementary (at least about 65% complementary over a stretch of at
least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about
90% complementary). See Kanehisa (1984) Nucleic Acids Res. 12: 203, incorporated
herein by reference. As a result, it is expected that a certain degree of mismatch at the
priming site is tolerated. Such mismatch may be small, such as a mono-, di- or tri-
nucleotide. Alternatively, it may comprise nucleotide loops, which we define as regions
in which mismatch encompasses an uninterrupted series of four or more nucleotides.
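
The percentage complementarity and the mismatch categories defined above can be
computed for an aligned primer and priming site. The sketch below assumes an ungapped,
equal-length alignment of DNA sequences; the function names and the example sequences
are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(primer, site):
    """Percent of primer bases that pair with the aligned template site.

    `primer` is written 5'->3' and `site` is the template region written
    3'->5', so position i of one faces position i of the other.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    matches = sum(COMPLEMENT[p] == s for p, s in zip(primer, site))
    return 100.0 * matches / len(primer)


def mismatch_runs(primer, site):
    """Lengths of each uninterrupted run of mismatched positions."""
    runs, current = [], 0
    for p, s in zip(primer, site):
        if COMPLEMENT[p] == s:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


def classify(run_length):
    """Name a mismatch run: mono-, di- or tri-nucleotide, else a loop (4 or more)."""
    return {1: "mono", 2: "di", 3: "tri"}.get(run_length, "loop")


primer = "ATGCGTACGTTAGCCTAGGA"   # hypothetical 20-mer, 5'->3'
site = "TACGCATGCAAAAGGATCCT"     # template 3'->5', two adjacent mismatches
print(percent_complementarity(primer, site))
print([classify(n) for n in mismatch_runs(primer, site)])
```

With these example sequences 18 of 20 positions pair, and the single mismatch run is a
dinucleotide.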

Overall, five factors influence the efficiency and selectivity of hybridization of the
primer to a second nucleic acid molecule. These factors, which are (i) primer length,
(ii) the nucleotide sequence and/or composition, (iii) hybridization temperature, (iv)
buffer chemistry and (v) the potential for steric hindrance in the region to which the
primer is required to hybridize, are important considerations when non-random priming
sequences are designed.

There is a positive correlation between primer length and both the efficiency and
accuracy with which a primer will anneal to a target sequence; longer sequences have a
higher melting temperature (Tm) than do shorter ones, and are less likely to be repeated
within a given target sequence, thereby minimizing promiscuous hybridization. Primer
sequences with a high G-C content or that comprise palindromic sequences tend to
self-hybridize, as do their intended target sites, since unimolecular, rather than
bimolecular, hybridization kinetics are generally favored in solution; at the same time, it
is important to design a primer containing sufficient numbers of G-C nucleotide pairings
to bind the target sequence tightly, since each such pair is bound by three hydrogen
bonds, rather than the two that are found when A and T bases pair. Hybridization
temperature varies inversely with primer annealing efficiency, as does the concentration
of organic solvents, e.g. formamide, that might be included in a hybridization mixture,
while increases in salt concentration facilitate binding. Under stringent hybridization
conditions, longer probes hybridize more efficiently than do shorter ones, which are
sufficient under more permissive conditions. Stringent hybridization conditions
typically include salt concentrations of less than about 1 M, more usually less than about
500 mM and preferably less than about 200 mM. Hybridization temperatures range
from as low as 0°C to greater than 22°C, greater than about 30°C, and (most often) in
excess of about 37°C. Longer fragments may require higher hybridization temperatures
for specific hybridization. As several factors affect the stringency of hybridization, the
combination of parameters is more important than the absolute measure of any one
alone.
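
The dependence of duplex stability on primer length and G-C content can be illustrated
numerically. The sketch below counts Watson-Crick hydrogen bonds as stated above (three
per G-C pair, two per A-T pair) and, as an assumption not taken from this specification,
uses the common rule-of-thumb estimate Tm = 2(A+T) + 4(G+C) in °C, which is a rough
approximation for short oligonucleotides only. The primer sequences are invented.

```python
def base_counts(primer):
    """Return (number of G/C bases, number of A/T bases) in a DNA primer."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    at = primer.count("A") + primer.count("T")
    return gc, at


def gc_fraction(primer):
    """Fraction of bases that are G or C."""
    gc, at = base_counts(primer)
    return gc / (gc + at)


def hydrogen_bonds(primer):
    """Hydrogen bonds in a perfectly paired duplex: 3 per G-C, 2 per A-T."""
    gc, at = base_counts(primer)
    return 3 * gc + 2 * at


def rule_of_thumb_tm(primer):
    """Approximate Tm in degrees C as 2*(A+T) + 4*(G+C); short oligos only."""
    gc, at = base_counts(primer)
    return 2 * at + 4 * gc


short_primer = "ATGCATGCATGCATGC"            # hypothetical 16-mer
long_primer = "ATGCATGCATGCATGCATGCATGC"     # hypothetical 24-mer, same composition
for p in (short_primer, long_primer):
    print(len(p), gc_fraction(p), hydrogen_bonds(p), rule_of_thumb_tm(p))
```

At equal G-C fraction the longer primer has more hydrogen bonds and a higher estimated
Tm, consistent with the correlation described in the text.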

Primers are designed with these considerations in mind. While estimates of the relative
merits of numerous sequences may be made mentally by one of skill in the art,
computer programs have been designed to assist in the evaluation of these several
parameters and the optimization of primer sequences. Examples of such programs are
"PrimerSelect" of the DNAStar™ software package (DNAStar, Inc.; Madison, WI)
and OLIGO 4.0 (National Biosciences, Inc.). Once designed, suitable oligonucleotides
are prepared by a suitable method, e.g. the phosphoramidite method described by
Beaucage and Caruthers (1981) Tetrahedron Lett., 22: 1859 or the triester method
according to Matteucci and Caruthers (1981) J. Am. Chem. Soc., 103: 3185, both
incorporated herein by reference, or by other chemical methods using either a
commercial automated oligonucleotide synthesizer or VLSIPS™ technology.

PCR is performed using template DNA (at least 1 fg; more usefully, 1-1000 ng) and at
least 25 pmol of oligonucleotide primers; it may be advantageous to use a larger
amount of primer when the primer pool is heavily heterogeneous, as each sequence is
represented by only a small fraction of the molecules of the pool, and amounts become
limiting in the later amplification cycles. A typical reaction mixture includes: 2 µl of
DNA, 25 pmol of oligonucleotide primer, 2.5 µl of 10X PCR buffer I (Perkin-Elmer,
Foster City, CA), 0.4 µl of 1.25 µM dNTP, 0.15 µl (or 2.5 units) of Taq DNA
polymerase (Perkin Elmer, Foster City, CA) and deionized water to a total volume of
25 µl. Mineral oil is overlaid and the PCR is performed using a programmable thermal
cycler.

The length and temperature of each step of a PCR cycle, as well as the number of
cycles, is adjusted in accordance with the stringency requirements in effect. Annealing
temperature and timing are determined both by the efficiency with which a primer is
expected to anneal to a template and the degree of mismatch that is to be tolerated;
obviously, when nucleic acid molecules are simultaneously amplified and mutagenized,
mismatch is required, at least in the first round of synthesis. In attempting to amplify a
population of molecules using a mixed pool of mutagenic primers, the loss, under
stringent (high-temperature) annealing conditions, of potential mutant products that
would only result from low melting temperatures is weighed against the promiscuous
annealing of primers to sequences other than the target site. The ability to optimize the
stringency of primer annealing conditions is well within the knowledge of one of
moderate skill in the art. An annealing temperature of between 30°C and 72°C is used.
Initial denaturation of the template molecules normally occurs at between 92°C and
99°C for 4 minutes, followed by 20-40 cycles consisting of denaturation (94-99°C for
15 seconds to 1 minute), annealing (temperature determined as discussed above; 1-2
minutes), and extension (72°C for 1-5 minutes, depending on the length of the amplified
product). Final extension is generally for 4 minutes at 72°C, and may be followed by an
indefinite (0-24 hour) step at 4°C.
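
A cycling programme of the kind described (an initial denaturation, a repeated
three-step cycle and a final extension) can be represented as data and its programmed
hold time summed. The sketch below is illustrative only: the Step class, the function
name and every temperature and duration are assumed example values, and the annealing
temperature and extension time in particular must be chosen for the actual primers and
product.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    temperature_c: float
    seconds: int


def total_runtime_seconds(initial, cycle_steps, cycles, final):
    """Sum hold times: one initial step, `cycles` repeats of the cycle, one final step.

    Ramp times between temperatures and any final cold hold are ignored.
    """
    per_cycle = sum(step.seconds for step in cycle_steps)
    return initial.seconds + cycles * per_cycle + final.seconds


# Illustrative values only.
initial = Step("initial denaturation", 94.0, 4 * 60)
cycle = [
    Step("denaturation", 94.0, 30),
    Step("annealing", 55.0, 60),
    Step("extension", 72.0, 60),
]
final = Step("final extension", 72.0, 4 * 60)

seconds = total_runtime_seconds(initial, cycle, cycles=30, final=final)
print(f"{seconds / 60:.0f} minutes of programmed hold time")
```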

Structural analysis of repertoire members

Since the invention provides a repertoire of polypeptides of known main-chain
conformation, a three-dimensional structural model of any member of the repertoire is
easily generated. Typically, the building of such a model involves homology modelling
and/or molecular replacement. A preliminary model of the main-chain conformation is
created by comparison of the polypeptide sequence to a similar sequence of known
three-dimensional structure, by secondary structure prediction or by screening
structural libraries. Computational software may also be used to predict secondary
structures. In order to predict the conformations of the side-chains at the varied
positions, a side-chain rotamer library may be employed.

Antibodies for use as ligands in polypeptide selection

A generic or target ligand to be used in the polypeptide selection according to
the present invention may, itself, be an antibody. This is particularly true of generic
ligands, which bind to structural features that are substantially conserved in functional
polypeptides to be selected for inclusion in repertoires of the invention. If an
appropriate antibody is not publicly available, it may be produced by phage display
methodology (see above) or as follows:

Either recombinant proteins or those derived from natural sources can be used to
generate antibodies using standard techniques, well known to those in the field. For
example, the protein (or "immunogen") is administered to challenge a mammal. The
resulting antibodies can be collected as polyclonal sera, or antibody-producing cells
from the challenged animal can be immortalized (e.g. by fusion with an immortalizing
fusion partner to produce a hybridoma), which cells then produce monoclonal antibodies.

a. Polyclonal antibodies

The antigen protein is either used alone or conjugated to a conventional carrier
in order to increase its immunogenicity, and an antiserum to the peptide-carrier
conjugate is raised in an animal, as described above. Coupling of a peptide to a carrier
protein and immunizations may be performed as described (Dymecki et al. (1992) J.
Biol. Chem., 267: 4815). The serum is titered against protein antigen by ELISA or
alternatively by dot or spot blotting (Boersma and Van Leeuwen (1994) J. Neurosci.
Methods, 51: 317). The serum is shown to react strongly with the appropriate peptides
by ELISA, for example, following the procedures of Green et al. (1982) Cell, 28: 477.

b. Monoclonal antibodies 

Techniques for preparing monoclonal antibodies are well known, and
monoclonal antibodies may be prepared using any candidate antigen, preferably bound
to a carrier, as described by Arnheiter et al. (1981) Nature, 294, 278. Monoclonal
antibodies are typically obtained from hybridoma tissue cultures or from ascites fluid
obtained from animals into which the hybridoma tissue was introduced. Nevertheless,
monoclonal antibodies may be described as being "raised against" or "induced by" a
protein.

After being raised, monoclonal antibodies are tested for function and specificity by any
of a number of means. Similar procedures can also be used to test recombinant
antibodies produced by phage display or other in vitro selection technologies.
Monoclonal antibody-producing hybridomas (or polyclonal sera) can be screened for
antibody binding to the immunogen, as well. Particularly preferred immunological tests
include enzyme-linked immunoassays (ELISA), immunoblotting and
immunoprecipitation (see Voller, (1978) Diagnostic Horizons, 2: 1, Microbiological
Associates Quarterly Publication, Walkersville, MD; Voller et al. (1978) J. Clin.
Pathol., 31: 507; U.S. Reissue Pat. No. 31,006; UK Patent 2,019,408; Butler (1981)
Methods Enzymol., 73: 482; Maggio, E. (ed.), (1980) Enzyme Immunoassay, CRC
Press, Boca Raton, FL) or radioimmunoassays (RIA) (Weintraub, B., Principles of
radioimmunoassays, Seventh Training Course on Radioligand Assay Techniques, The
Endocrine Society, March 1986, pp. 1-5, 46-49 and 68-78), all to detect binding of the
antibody to the immunogen against which it was raised. It will be apparent to one
skilled in the art that either the antibody molecule or the immunogen must be labeled to
facilitate such detection. Techniques for labeling antibody molecules are well known to
those skilled in the art (see Harlow and Lane (1989) Antibodies, Cold Spring Harbor
Laboratory, pp. 1-726).

Alternatively, other techniques can be used to detect binding to the immunogen, thereby 
confirming the integrity of the antibody which is to serve either as a generic antigen or 
a target antigen according to the invention. These include chromatographic methods
such as SDS PAGE, isoelectric focusing, Western blotting, HPLC and capillary
electrophoresis.

"Antibodies" are defined herein as constructions using the binding (variable) region of
such antibodies, and other antibody modifications. Thus, an antibody useful in the
invention may comprise whole antibodies, antibody fragments, polyfunctional antibody
aggregates, or in general any substance comprising one or more specific binding sites
from an antibody. The antibody fragments may be fragments such as Fv, Fab and
F(ab')2 fragments or any derivatives thereof, such as single chain Fv fragments. The
antibodies or antibody fragments may be non-recombinant, recombinant or humanized.
The antibody may be of any immunoglobulin isotype, e.g., IgG, IgM, and so forth. In
addition, aggregates, polymers, derivatives and conjugates of immunoglobulins or their
fragments can be used where appropriate.

Metallic ions as ligands for the selection of polypeptides

[illegible] selection of polypeptides according to the invention. One such category of
ligand is that of metallic ions. [illegible] presence of a functional histidine (HIS) tag
using a Ni-NTA matrix. Immobilized metal affinity chromatography (IMAC; Hubert and
Porath (1980) J. Chromatography, 98: 247) [illegible] as well as others that may bind
metals, on the exposed surfaces of numerous proteins. It employs a resin, typically
agarose, comprising a bidentate metal chelator (e.g. iminodiacetic acid, IDA, a
dicarboxylic acid group) to which is complexed metallic [illegible] according to the
invention. [illegible] agarose/IDA preparation is "CHELATING SEPHAROSE 6B"
(Pharmacia Fine Chemicals; Piscataway, NJ). Metallic ions that are of use include, but
are not limited to, the divalent cations Ni2+, Cu2+, Zn2+ and Co2+. A pool of
polypeptide molecules is prepared in a binding buffer which consists essentially of salt
(typically, NaCl or KCl) at a 0.1 to 1.0 M [illegible], the latter of which has affinity
for the metallic ions of the resin, but to a lesser degree than does a polypeptide to be
selected according to the invention. Useful concentrations of the weak ligand
range from 0.01 to 0.1 M in the binding buffer.

The polypeptide pool is contacted with the resin under conditions which permit
polypeptides having metal-binding domains (see below) to bind; after impurities are
washed away, the selected polypeptides are eluted with a buffer in which the weak
ligand is present in a higher concentration than in the binding buffer, specifically, at a
concentration sufficient for the weak ligand to displace the selected polypeptides,
despite its lower binding affinity for the metallic ions. Useful concentrations of the
weak ligand in the elution buffer are 10- to 50-fold higher than in the binding buffer,
typically from 0.1 to 0.3 M; note that the concentration of salt in the elution buffer
equals that in the binding buffer. According to the methods of the present invention, the
metallic ions of the resin typically serve as the generic ligand; however, it is
contemplated that they may also be used as the target ligand.
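
The relationship between the binding and elution buffers described above is simple arithmetic. The following is a minimal sketch, assuming a hypothetical helper named elution_range; only the 0.01 to 0.1 M binding range and the 10- to 50-fold factor are taken from the text, and everything else is illustrative.

```python
# Hypothetical helper; the function name and structure are illustrative only.
BINDING_RANGE_M = (0.01, 0.1)  # weak-ligand range in the binding buffer (from the text)
FOLD_RANGE = (10, 50)          # elution buffer is 10- to 50-fold more concentrated


def elution_range(binding_conc_m):
    """Return (low, high) weak-ligand concentrations in M for the elution buffer."""
    low, high = BINDING_RANGE_M
    if not (low <= binding_conc_m <= high):
        raise ValueError("binding concentration outside the 0.01 to 0.1 M range")
    return (binding_conc_m * FOLD_RANGE[0], binding_conc_m * FOLD_RANGE[1])


if __name__ == "__main__":
    lo, hi = elution_range(0.01)
    print(f"elution buffer: {lo:.2f} to {hi:.2f} M weak ligand")
```

For a binding buffer at 0.01 M the sketch gives roughly 0.1 to 0.5 M, a span that contains the typical 0.1 to 0.3 M elution range stated above.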

IMAC is carried out using a standard chromatography apparatus (columns, through
which buffer is drawn by gravity, pulled by a vacuum or driven by pressure);
alternatively, a large-batch procedure is employed, in which the metal-bearing resin is
mixed, in slurry form, with the polypeptide pool from which members of a repertoire
of the invention are to be selected.



Partial purification of a serum T4 protein by IMAC has been described (Staples et al.,
U.S. Patent No. 5,169,936); however, the broad spectrum of proteins comprising
surface-exposed metal-binding domains also encompasses other soluble T4 proteins,
human serum proteins (e.g. IgG, haptoglobin, hemopexin, Gc-globulin, C1q, C3, C4),
human desmoplasmin, Dolichos biflorus lectin, zinc-inhibited Tyr(P) phosphatases,
phenolase, carboxypeptidase isoenzymes, Cu,Zn superoxide dismutases (including those
of humans and all other eukaryotes), nucleoside diphosphatase, leukocyte interferon,
lactoferrin, human plasma α2-SH glycoprotein, α2-macroglobulin, α1-antitrypsin,
plasminogen activator, gastrointestinal polypeptides, pepsin, human and bovine serum
albumin, granule proteins from granulocytes, lysozymes, non-histone proteins, human
fibrinogen, human serum transferrin, human lymphotoxin, calmodulin, protein A,
avidin, myoglobins, somatomedins, human growth hormone, transforming growth
factors, platelet-derived growth factor, α-human atrial natriuretic polypeptide,
cardiodilatin and others. In addition, extracellular domain sequences of membrane-bound
proteins may be purified using IMAC. Note that repertoires comprising any of

the above proteins or metal-binding variants thereof may be produced according to the
methods of the invention.

Following elution, selected polypeptides are removed from the metal binding buffer and
placed in a buffer appropriate to their next use. If the metallic ion has been used to
generate a first selected polypeptide pool according to the invention, the molecules of
that pool are placed into a buffer that is optimized for binding with the second ligand to
be used in selection of the members of the functional polypeptide repertoire. If the
metal is, instead, used in the second selection step, the polypeptides of the repertoire
are transferred to a buffer suitable either to storage (e.g. a 0.5% glycine buffer) or the
use for which they are intended. Such buffers include, but are not limited to: water,
[illegible] water-miscible organic solvents, physiological salt buffers and
protein/nucleic acid or protein/protein binding buffers. Alternatively, the polypeptide
molecules may be dehydrated (i.e. by lyophilization) or immobilized on a solid or
semi-solid support, such as a nitrocellulose or nylon filtration membrane or a gel matrix
(i.e. of agarose or polyacrylamide) or crosslinked to a chromatography resin.

[illegible] buffer by any of a number of methods known in the art. The polypeptide
eluate may be dialyzed against water or another solution of choice; if the polypeptides
are to be lyophilized, water to which has been added protease inhibitors (e.g. pepstatin,
aprotinin, leupeptin, or others) is used. Alternatively, the sample may be subjected to
ammonium sulfate precipitation, which is well known in the art, prior to resuspension in
the medium of choice.

Use of polypeptides selected according to the invention

Polypeptides selected according to the method of the present invention may be
[illegible] ligand-polypeptide binding, including in vivo therapeutic and prophylactic
applications, in vitro and in vivo diagnostic applications, in vitro assay and reagent
applications, and the like. For example, in the case of antibodies, antibody molecules
may be used in antibody based assay techniques, such as ELISA techniques, according
to methods known to those skilled in the art.

As alluded to above, the molecules selected according to the invention are of use in
diagnostic, prophylactic and therapeutic procedures. For example, enzyme variants
generated and selected by these methods may be assayed for activity, either in vitro or
in vivo using techniques well known in the art, by which they are incubated with
candidate substrate molecules and the conversion of substrate to product is analyzed.
Selected cell-surface receptors or adhesion molecules might be expressed in cultured
cells which are then tested for their ability to respond to biochemical stimuli or for their
affinity with other cell types that express cell-surface molecules to which the
undiversified adhesion molecule would be expected to bind, respectively. Antibody
polypeptides selected according to the invention are of use diagnostically in Western
analysis and in situ protein detection by standard immunohistochemical procedures; for
use in these applications, the antibodies of a selected repertoire may be labelled in
accordance with techniques known to the art. In addition, such antibody polypeptides
may be used preparatively in affinity chromatography procedures, when complexed to a
chromatographic support, such as a resin. All such techniques are well known to one of
skill in the art.

Therapeutic and prophylactic uses of proteins prepared according to the invention
involve the administration of polypeptides selected according to the invention to a
recipient mammal, such as a human. Of particular use in this regard are antibodies,
other receptors (including, but not limited to T-cell receptors) and, in the case in which
an antibody or receptor was used as either a generic or target ligand, proteins which
bind to them.

Substantially pure antibodies or binding proteins thereof of at least 90 to 95%
homogeneity are preferred for administration to a mammal, and 98 to 99% or more
homogeneity is most preferred for pharmaceutical uses, especially when the mammal is
a human. Once purified, partially or to homogeneity as desired, the selected
polypeptides may be used diagnostically or therapeutically (including extracorporeally)
or in developing and performing assay procedures, immunofluorescent stainings and the
like (Lefkovits and Pernis, (1979 and 1981) Immunological Methods, Volumes I and II,
Academic Press, NY).

The selected antibodies or binding proteins thereof of the present invention will
typically find use in preventing, suppressing or treating inflammatory states, allergic
hypersensitivity, cancer, bacterial or viral infection, and autoimmune disorders (which
include, but are not limited to, Type I diabetes, multiple sclerosis, rheumatoid arthritis,
systemic lupus erythematosus, Crohn's disease and myasthenia gravis).

In the instant application, the term "prevention" involves administration of the
protective composition prior to the induction of the disease. "Suppression" refers to
administration of the composition after an inductive event, but prior to the clinical
appearance of the disease. "Treatment" involves administration of the protective
composition after disease symptoms become manifest.

Animal model systems which can be used to screen the effectiveness of the antibodies
or binding proteins thereof in protecting against or treating the disease are available.
Methods for the testing of systemic lupus erythematosus (SLE) in susceptible mice are
[illegible] (1978) New Eng. J. Med., 299: 515). Myasthenia Gravis (MG) is tested in
SJL/J female mice by inducing the disease with soluble AchR protein from another
species (Lindstrom et al. (1988) Adv. Immunol., 42: 233). Arthritis is induced in a
susceptible strain of mice by injection of Type II collagen (Stuart et al. (1984) Ann.
Rev. Immunol., 42: 233). A model by which adjuvant arthritis is induced in susceptible
rats [illegible] mycobacterial heat shock protein has been described (Van Eden et al.
[illegible]). [illegible] induced in mice by administration of thyroglobulin as described
(Maron et al. (1980) J. Exp. Med., 152: 1115). Insulin dependent diabetes mellitus
(IDDM) occurs naturally or can be induced in certain strains of mice such as those
described by Kanasawa et al. (1984) Diabetologia, 27: 113. EAE in mouse and rat
serves as a model for MS in human. In this model, the demyelinating disease is induced
by administration of myelin basic protein (see [illegible] Mischer et al., eds.,
[illegible] et al. (1987) J. Immunol., 138: 179).

The selected antibodies, receptors (including, but not limited to T-cell receptors) or
binding proteins thereof of the present invention may also be used in combination with
other antibodies, particularly monoclonal antibodies (MAbs) reactive with other markers
on human cells responsible for the diseases. For example, suitable T-cell markers can
include those grouped into the so-called "Clusters of Differentiation," as named by the
First International Leukocyte Differentiation Workshop (Bernhard et al. (1984)
Leukocyte Typing, Springer Verlag, NY).

[illegible] will be utilized in purified form together with pharmacologically appropriate
carriers. Typically, these carriers include aqueous or alcoholic/aqueous solutions,
emulsions or suspensions, any including saline and/or buffered media. Parenteral
vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium
chloride and lactated Ringer's. Suitable physiologically-acceptable adjuvants, if
necessary to keep a polypeptide complex in suspension, may be chosen from thickeners
such as carboxymethylcellulose, polyvinylpyrrolidone, gelatin and alginates.

Intravenous vehicles include fluid and nutrient replenishers and electrolyte replenishers,
such as those based on Ringer's dextrose. Preservatives and other additives, such as
antimicrobials, antioxidants, chelating agents and inert gases, may also be present
(Mack (1982) Remington's Pharmaceutical Sciences, 16th Edition).

The selected polypeptides of the present invention may be used as separately
administered compositions or in conjunction with other agents. These can include
various immunotherapeutic drugs, such as cyclosporine, methotrexate, adriamycin or
cisplatinum, and immunotoxins. Pharmaceutical compositions can include "cocktails" of
various cytotoxic or other agents in conjunction with the selected antibodies, receptors
or binding proteins thereof of the present invention, or even combinations of selected
polypeptides according to the present invention having different specificities, such as
polypeptides selected using different target ligands, whether or not they are pooled
prior to administration.

The route of administration of pharmaceutical compositions according to the invention
may be any of those commonly known to those of ordinary skill in the art. For therapy,
including without limitation immunotherapy, the selected antibodies, receptors or
binding proteins thereof of the invention can be administered to any patient in
accordance with standard techniques. The administration can be by any appropriate
mode, including parenterally, intravenously, intramuscularly, intraperitoneally,
transdermally, via the pulmonary route, or also, appropriately, by direct infusion with a
catheter. The dosage and frequency of administration will depend on the age, sex and
condition of the patient, concurrent administration of other drugs, counterindications
and other parameters to be taken into account by the clinician.

The selected polypeptides of this invention can be lyophilized for storage and
reconstituted in a suitable carrier prior to use. This technique has been shown to be
effective with conventional immunoglobulins and art-known lyophilization and
reconstitution techniques can be employed. It will be appreciated by those skilled in the
art that lyophilization and reconstitution can lead to varying degrees of antibody activity
loss (e.g. with conventional immunoglobulins, IgM antibodies tend to have greater
activity loss than IgG antibodies) and that use levels may have to be adjusted upward to
compensate.

The compositions containing the present selected polypeptides or a cocktail thereof can
be administered for prophylactic and/or therapeutic treatments. In certain therapeutic
applications, an adequate amount to accomplish at least partial inhibition, suppression,
[illegible] cells is defined as a "therapeutically-effective dose". Amounts needed to
achieve this dosage will depend upon the severity of the disease and the general state of
the patient's own immune system, but generally range from 0.005 to 5.0 mg of selected
antibody, receptor (e.g. a T-cell receptor) or binding protein thereof per kilogram of
body weight, with doses of 0.05 to 2.0 mg/kg/dose being more commonly used. For
prophylactic applications, compositions containing the present selected polypeptides or
cocktails thereof may also be administered in similar or slightly lower dosages.

A composition containing a selected polypeptide according to the present invention may
be used in prophylactic and therapeutic settings to aid in the alteration, inactivation,
killing or removal of a select target cell population in a mammal. In addition, the
selected repertoires of polypeptides described herein may be used extracorporeally or in
[illegible] from a heterogeneous collection of cells. Blood from a mammal may be
combined [illegible] receptors or binding proteins [illegible] removed from the blood
for return to the mammal in accordance with standard techniques.

Example 1 

Antibody library design 






A. Main-chain conformation 



For five of the six antigen binding loops of human antibodies (L1, L2, L3, H1 and H2)
[illegible] main-chain conformations, or canonical structures (Chothia et al. (1992)
J. Mol. Biol., 227: 799; Tomlinson et al. (1995) EMBO J., 14: [illegible]
conformation according to the invention. These are: H1 - CS 1 (79% of the expressed
[illegible] main-chain conformations for short [illegible] H3 has a CDR3 length (as
defined by Kabat et al. (1991) Sequences of proteins of immunological interest, U.S.
Department of Health and Human Services) of seven residues and has a lysine or
arginine residue at position H94 and an aspartate residue at position H101, a salt-bridge
is formed between these two residues and in most cases a single main-chain
conformation is likely to be produced. There are at least 16 human antibody sequences
in the EMBL data library with the required H3 length and key residues to form this
conformation and at least two crystallographic structures in the protein data bank which
can be used as a basis for antibody modelling (2cgr and 1tet).

In this case, the most frequently expressed germline gene segments which encode the
desired loop lengths and key residues to produce the required combinations of canonical
structures are the Vh segment 3-23 (DP-47), the Jh segment JH4b, the Vk segment
O2/O12 (DPK9) and the Jk segment Jk1. These segments can therefore be used in
combination as a basis to construct a library with the desired single main-chain
conformation. The Vk segment O2/O12 (DPK9) is a member of the VkI family and
therefore will bind the superantigen Protein L. The Vh segment 3-23 (DP-47) is a
member of the Vh3 family and therefore should bind the superantigen Protein A,
which can then be used as a generic ligand.

B. Selection of positions for variation 

Analysis of human Vh and Vk sequences indicates that the most diverse positions in
the mature repertoire are those that make the most contacts with antigens (see
Tomlinson et al. (1996) J. Mol. Biol., 256: 813; Figure 1). These positions form the
functional antigen binding site and are therefore selected for side-chain diversification
(Figure 2). H54 is a key residue and points away from the antigen binding site in the
chosen H2 canonical structure 3 (the diversity seen at this position is due to canonical
structures 1, 2 and 4 where H54 points into the binding site). In this case H55 (which
points into the binding site) is diversified instead. The diversity at these positions is
created either by germline or junctional diversity in the primary repertoire or by
somatic hypermutation (Tomlinson et al. (1996) J. Mol. Biol., 256: 813; Figure 1).
Two different subsets of residues in the antigen binding site were therefore varied to
create two different library formats. In the "primary" library the residues selected for
variation are from H2, H3, L2 and L3 (diversity in these loops is mainly the result of
germline or junctional diversity). The positions varied in this library are: H50, H52,
H52a, H53, H55, H56, H58, H95, H96, H97, H98, L50, L53, L91, L92, L93, L94
and L96 (18 residues in total, Figure 2). In the "somatic" library the residues selected
for variation are from H1, H3, L1 and the end of L3 (diversity here is mainly the result
of somatic hypermutation or junctional diversity). The positions varied in this library
are: H31, H33, H35, H95, H96, H97, H98, L30, L31, L32, L34 and L96 (12 residues
in total, Figure 2).
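
The two position lists can be tallied and compared mechanically. The sketch below is illustrative only: the variable and function names are hypothetical, and the sequence-space figure assumes that every varied position could take any of the 20 amino acids independently, which is an idealisation rather than a statement about the libraries as built.

```python
# Positions as listed in the text; names of variables and functions are illustrative.
PRIMARY = ["H50", "H52", "H52a", "H53", "H55", "H56", "H58",
           "H95", "H96", "H97", "H98",
           "L50", "L53", "L91", "L92", "L93", "L94", "L96"]
SOMATIC = ["H31", "H33", "H35", "H95", "H96", "H97", "H98",
           "L30", "L31", "L32", "L34", "L96"]


def tally(positions):
    """Return (total, heavy-chain count, light-chain count) for a position list."""
    heavy = sum(p.startswith("H") for p in positions)
    light = sum(p.startswith("L") for p in positions)
    return len(positions), heavy, light


def sequence_space(n_positions, alphabet_size=20):
    """Side-chain combinations if each varied position is free and independent."""
    return alphabet_size ** n_positions


if __name__ == "__main__":
    print("primary:", tally(PRIMARY))
    print("somatic:", tally(SOMATIC))
    print("shared :", sorted(set(PRIMARY) & set(SOMATIC)))
    print("20^18  :", f"{sequence_space(18):.2e}")
```

The tallies agree with the totals of 18 and 12 given in the text, and the positions common to both formats are H95 to H98 and L96.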

C. Selection of amino acid use at the positions to be varied 

Side-chain diversity is introduced into the "primary" and "somatic" libraries by
incorporating either the codon NNK (which encodes all 20 amino acids and the
TAG stop codon, but not the TGA and TAA stop codons) or the codon DVT (which
encodes 22% serine and 11% each of tyrosine, asparagine, glycine, alanine, aspartate,
threonine and cysteine and which, using single, double, triple and quadruple degeneracy
in equal ratios at each position, most closely mimics the distribution of amino acid
residues found in the antigen binding sites of natural human antibodies).
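
The residue distributions quoted for the two codons can be checked by expanding each degenerate codon against the standard genetic code. The following is a minimal sketch, not part of the described method: the function names are hypothetical, and the stop-free estimate assumes equimolar bases at each degenerate site and independent positions.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...; '*' is a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC degenerate base codes needed for the two schemes named in the text.
IUPAC = {"N": "ACGT", "K": "GT", "D": "AGT", "V": "ACG", "T": "T"}


def expand(degenerate):
    """List every concrete codon encoded by a degenerate codon such as 'NNK'."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate))]


def residue_frequencies(degenerate):
    """Map each encoded residue (or '*') to its fraction of the codon mix."""
    codons = expand(degenerate)
    counts = Counter(CODON_TABLE[c] for c in codons)
    return {aa: n / len(codons) for aa, n in counts.items()}


def stop_free_fraction(degenerate, positions):
    """Fraction of clones with no stop codon at any varied position,
    assuming equimolar bases and independent positions."""
    p_stop = residue_frequencies(degenerate).get("*", 0.0)
    return (1.0 - p_stop) ** positions


if __name__ == "__main__":
    for scheme in ("NNK", "DVT"):
        freqs = residue_frequencies(scheme)
        n_amino = len([aa for aa in freqs if aa != "*"])
        print(scheme, len(expand(scheme)), "codons,", n_amino, "amino acids,",
              f"stop-free over 18 positions: {stop_free_fraction(scheme, 18):.2f}")
```

NNK expands to 32 codons covering all 20 amino acids plus TAG, and DVT expands to 9 codons covering 8 amino acids, with serine at 2/9 (22%) and the others at 1/9 (11%) each. Under the stated assumptions, about 56% of clones with 18 NNK positions would carry no TAG codon, whereas DVT introduces no stop codons at all; this estimate ignores frame-shifts and folding defects.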

Example 2 

Library construction and selection with the generic ligands

The "primary" and "somatic" libraries were assembled by PCR using the
[illegible] DPK9 (Cox et al. [illegible] 227: 7768). Briefly, a first round of
amplification was performed using pairs of 5' (back) primers in conjunction with NNK
or DVT 3' (forward) primers together with the corresponding germline V gene segment
as template (see Table 1). This produces eight separate DNA fragments for each of the
NNK and DVT libraries. A second round of amplification was then performed using the
5' (back) primers and the 3' (forward) primers shown in Table 1 together with two of
the purified fragments from the first round of amplification. This produces four separate
fragments for each of the NNK and DVT libraries (a "primary" Vh fragment, 5A; a
"primary" Vk fragment, 6A; a "somatic" Vh fragment, 5B; and a "somatic" Vk
fragment, 6B).

Each of these fragments was cut and then ligated into pCLEANVH (for the Vh
fragments) or pCLEANVK (for the Vk fragments) which contain dummy Vh and Vk
domains respectively in a version of pHEN1 which does not contain any TAG codons
or peptide tags (Hoogenboom & Winter (1992) J. Mol. Biol., 227: 381). The ligations
were then electroporated into the non-suppressor E. coli strain HB2151. Phage from
each of these libraries was produced and separately selected using immunotubes coated
with 10 µg/ml of the generic ligands Protein A and Protein L for the Vh and Vk
libraries respectively. DNA from E. coli infected with selected phage was then
prepared and cut so that the dummy Vk inserts were replaced by the corresponding Vk
libraries. Electroporation of these libraries results in the following insert library sizes:
9.21 x 10^8 ("primary" NNK), 5.57 x 10^[illegible] ("primary" DVT), 1.00 x
10^[illegible] ("somatic" NNK) and 2.38 x 10^[illegible] ("somatic" DVT). As a control
for pre-selection four additional libraries were created but without selection with the
generic ligands Protein A and Protein L: insert library sizes for these libraries were
1.29 x 10^[illegible] ("primary" NNK), 2.40 x 10^8 ("primary" DVT), 1.16 x
10^[illegible] ("somatic" NNK) and 2.17 x 10^[illegible] ("somatic" DVT).

To verify the success of the pre-selection step, DNA from the selected and unselected
"primary" NNK libraries was cloned into a pUC based expression vector and
electroporated into HB2151. 96 clones were picked at random from each recloned
library and induced for expression of soluble scFv fragments. Production of functional
scFv is assayed by ELISA using Protein L to capture the scFv and then Protein A-HRP
conjugate to detect binding. Only scFv which express functional Vh and Vk domains
(no frame-shifts, stop codons, folding or expression mutations) will give a signal using
this assay. The proportion of functional antibodies in each library (ELISA signals above
background) was 5% with the unselected "primary" NNK library and 75% with the
selected version of the same (Figure 3). Sequencing of clones which were negative in
the assay confirmed the presence of frame-shifts, stop codons, PCR mutations at critical
framework residues and amino acids in the antigen binding site which must prevent
folding and/or expression.
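
The comparison above amounts to a ratio of functional fractions. The following is a minimal sketch using only the 5% and 75% figures and the 96-clone sample size from the text; the function names are illustrative, and the text reports percentages rather than clone counts.

```python
def functional_fraction(positive_clones, clones_screened=96):
    """Fraction of screened clones giving an ELISA signal above background."""
    if not 0 <= positive_clones <= clones_screened:
        raise ValueError("positive_clones must lie between 0 and clones_screened")
    return positive_clones / clones_screened


def fold_enrichment(fraction_selected, fraction_unselected):
    """Ratio of functional fractions after and before pre-selection."""
    if fraction_unselected <= 0:
        raise ValueError("unselected fraction must be positive")
    return fraction_selected / fraction_unselected


if __name__ == "__main__":
    # Percentages reported in the text for the "primary" NNK library.
    print(f"{fold_enrichment(0.75, 0.05):.0f}-fold enrichment")
```

With the reported figures this corresponds to a 15-fold enrichment of functional members. A count of 72 positives out of 96 would match the 75% figure, although that count is an inference and is not stated in the text.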

Example 3 

Library selection against target ligands 

The "primary" and "somatic" NNK libraries (without pre-selection) were separately
selected using five antigens (bovine ubiquitin, rat BIP, bovine histone, NIP-BSA and
hen egg lysozyme) coated on immunotubes at various concentrations. After 2-4 rounds
of selection, highly specific antibodies were obtained to all antigens except hen egg
lysozyme. Clones were selected at random for sequencing, demonstrating a range of
antibodies to each antigen (Figure 4).

In the second phase, phage from the pre-selected NNK and DVT libraries were mixed
1:1 to create a single "primary" library and a single "somatic" library. These libraries
were then separately selected using seven antigens (FITC-BSA, human leptin, human
thyroglobulin, BSA, hen egg lysozyme, mouse IgG and human IgG) coated on
immunotubes at various concentrations. After 2-4 rounds of selection, highly specific
antibodies were obtained to all the antigens, including hen egg lysozyme which failed to
produce positives in the previous phase of selection using the libraries that had not been
pre-selected using the generic ligands. Clones were selected at random for sequencing,
demonstrating a range of different antibodies to each antigen (Figure 4).

Example 4 

Effect of pre-selection on scFv expression and production of phage bearing scFv

[illegible] selected "primary" DVT libraries is cloned into a pUC based expression
vector and [illegible] cases, 96 clones are picked at random from each recloned library
and induced for expression of soluble scFv [illegible] scFv followed by the use of
Protein A-HRP to detect bound scFv. The percentage of functional antibodies in each
library is 35.4% (unselected) and 84.4% (pre-selected), indicating a 2.4 fold increase in
the number of functional members as a result of [illegible] equivalent NNK library
since the DVT codon does not encode the TAG stop codon. In the unselected NNK
library, the presence of a TAG stop codon in a [illegible] expression. Pre-selection of
the NNK library removes clones containing TAG stop
codons to produce a library in which a high proportion of members express soluble
[illegible]

[illegible] "primary" [illegible] library on total scFv [illegible] fragments. The
concentration of expressed scFv in the supernatant is then determined by incubating two
fold dilutions (columns 1-12 in Figure 5) of [illegible] Protein L coated ELISA plate,
followed by [illegible] pre-selected DVT libraries. These are used to plot a standard
curve [illegible] µg/ml respectively, i.e. a 5.2 fold increase in expression due to
pre-selection with Protein A and Protein L.
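
The quantification described above relies on a two-fold dilution series read against a standard curve. The sketch below shows one way such a calculation could be set up. The standard concentrations, signal values and function names are hypothetical placeholders and are not data from this example, and the piecewise-linear interpolation is an assumption, since the text does not state how the curve was fitted.

```python
from bisect import bisect_left


def twofold_series(start, columns=12):
    """Relative concentrations across a plate row: start, start/2, start/4, ..."""
    return [start / 2 ** i for i in range(columns)]


def interpolate(signal, standards):
    """Estimate concentration from a signal by piecewise-linear interpolation.

    standards: list of (signal, concentration) pairs defining the standard curve.
    """
    pts = sorted(standards)
    signals = [s for s, _ in pts]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the range of the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return pts[i][1]
    (s0, c0), (s1, c1) = pts[i - 1], pts[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


def undiluted_concentration(signal, dilution_factor, standards):
    """Back-calculate the neat supernatant concentration from a diluted well."""
    return interpolate(signal, standards) * dilution_factor


if __name__ == "__main__":
    # Hypothetical standard curve: (ELISA signal, scFv concentration in ug/ml).
    curve = [(0.1, 0.5), (0.4, 2.0), (0.8, 4.0), (1.6, 8.0)]
    print(twofold_series(1.0)[:4])
    print(undiluted_concentration(0.6, dilution_factor=4, standards=curve))
```

Reading a well from the linear part of the curve and multiplying by its dilution factor gives the undiluted concentration; the ratio of two such values for the pre-selected and unselected libraries is the fold increase in expression.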

[illegible] DVT libraries are grown and polyclonal phage is produced. Equal volumes of
phage from the two libraries are run under denaturing conditions on a 4-12% Bis-Tris
NuPAGE Gel with MES running buffer. The resulting gel is western blotted, probed
using an anti-pIII antibody and exposed to X-ray film (Figure 6). The lower band in
each case corresponds to pIII protein alone, whilst the higher band contains the
pIII-scFv fusion protein. Quantification of the band intensities using the software
package NIH image indicates that pre-selection results in an 11.8 fold increase in the
amount of fusion protein present in the phage. Indeed, 43% of the total pIII in the
pre-selected phage exists as pIII-scFv fusion, suggesting that most phage particles will
have at least one scFv displayed on the surface.


Hence, not only does pre-selection using generic ligands enable enrichment of 
functional members from a repertoire but it also leads to preferential selection of those 
members which are well expressed and (if required) are able to elicit a high level of 
display on the surface of phage without being cleaved by bacterial proteases.

Claims

1. A method for selecting, from a repertoire of polypeptides, a population of
functional polypeptides which bind a target ligand in a first binding site and a generic
ligand in a second binding site, which generic ligand is capable of binding functional
members of the repertoire regardless of target ligand specificity, comprising the steps of:

a) contacting the repertoire with the generic ligand and selecting functional
polypeptides bound thereto; and

b) contacting the selected functional polypeptides with the target ligand and
selecting a population of polypeptides which bind to the target ligand.



2. A method according to claim 1 wherein the repertoire of polypeptides is first
contacted with the target ligand and then with the generic ligand.

3. A method according to claim 1 or 2 wherein the generic ligand binds a subset of 
the repertoire of polypeptides. 

4. A method according to claim 3 wherein two or more subsets are selected from
the repertoire of polypeptides.

5. A method according to claim 4 wherein the selection is performed with two or 
more generic ligands. 

6. A method according to claims 4 or 5 wherein the two or more subsets are
combined after selection to produce a further repertoire of polypeptides.

7. A method according to any preceding claim wherein two or more repertoires of
polypeptides are contacted with generic ligands and the subsets of polypeptides thereby
obtained are then combined.

8. A method according to any preceding claim, wherein the polypeptides of the 
repertoire are of the immunoglobulin superfamily. 

9. A method according to claim 8, wherein the polypeptides are antibody or T-cell
receptor polypeptides.

[claims 10 and 11 illegible]

12. A method wherein a repertoire of polypeptides according to claim 10 and a
[illegible] generic ligands and the subsets thereby obtained are then combined.

13. [illegible] wherein the generic ligand is [illegible] population and a [illegible]

14. [illegible] claims 1 to 13, comprising binding the members to the generic ligand.

15. A library wherein the functional members have binding sites for both generic
and target ligands.

16. A library designed for selection with both generic and target ligands.

[claims 17 to 20 illegible]

21. A library wherein a repertoire of polypeptides according to claim 19 and a
repertoire of polypeptides according to claim 20 are contacted with generic ligands and
the subsets thereby obtained are then pooled.

22. A library according to any one of claims 15 to 21, wherein the functional
members of the repertoire have a known main-chain conformation.

23. A library according to claim 22, wherein the functional members of the 
repertoire have a single main-chain conformation. 


24. A library according to claims 22 or 23, wherein the immunoglobulin scaffold is 
based on germline V gene segment sequences. 

25. A library according to any one of claims 15 to 24, wherein the polypeptides are
varied at random positions.

26. A library according to any one of claims 15 to 24, wherein the polypeptides are 
varied at selected positions. 

27. A library according to claim 26, wherein the selected positions are those which
form the binding site for the target ligand.

28. A library according to claim 27, wherein the selected positions are a subset of 
those which form the binding site for the target ligand. 


29. A library wherein a repertoire of polypeptides according to claim 28 is first 
contacted with a target ligand in order to isolate a subset of polypeptides specific for the 
target ligand, the subset of polypeptides then being varied at a further subset of residues 
in order to modify the function, specificity or affinity of target ligand interaction. 


30. A library according to claims 26-29, wherein the variation is achieved by 
incorporating all 20 different amino acids at the positions to be varied. 

31. A library according to claims 26-29, wherein the variation is achieved by
incorporating some but not all of the 20 different amino acids at the positions to be
varied.